forecasting output and inflation: the role of asset prices · able research on forecasting economic...

42
Journal of Economic Literature Vol. XLI (September 2003) pp. 788–829 Forecasting Output and Inflation: The Role of Asset Prices JAMES H. STOCK and Mark W. Watson 1 788 1. Introduction B ecause asset prices are forward-looking, they constitute a class of potentially use- ful predictors of inflation and output growth. The premise that interest rates and asset prices contain useful information about future economic developments embodies foundational concepts of macroeconomics: Irving Fisher’s theory that the nominal inter- est rate is the real rate plus expected infla- tion; the notion that a monetary contraction produces temporarily high interest rates— an inverted yield curve—and leads to an economic slowdown; and the hypothesis that stock prices reflect the expected present dis- counted value of future earnings. Indeed, Wesley Mitchell and Arthur Burns (1938) included the Dow Jones composite index of stock prices in their initial list of leading indicators of expansions and contractions in the U.S. economy. The past fifteen years have seen consider- able research on forecasting economic activ- ity and inflation using asset prices, where we interpret asset prices broadly as including interest rates, differences between interest rates (spreads), returns, and other measures related to the value of financial or tangible assets (bonds, stocks, housing, gold, etc.). This research on asset prices as leading indi- cators arose, at least in part, from the insta- bility in the 1970s and early 1980s of fore- casts of output and inflation based on monetary aggregates and of forecasts of infla- tion based on the (non-expectational) Phillips curve. One problem with using monetary aggregates for forecasting is that they require ongoing redefinition as new financial instru- ments are introduced. In contrast, asset prices and returns typically are observed in real time with negligible measurement error. The now-large literature on forecasting using asset prices has identified a number of asset prices as leading indicators of either economic activity or inflation; these include interest rates, term spreads, stock returns, dividend yields, and exchange rates. This lit- erature is of interest from several perspec- tives. First and most obviously, those whose daily task it is to produce forecasts—notably, economists at central banks, and business 1 Stock: Department of Economics, Harvard University, and NBER. Watson: Woodrow Wilson School and Department of Economics, Princeton University, and NBER. We thank Charles Goodhart and Boris Hofmann for sharing their housing price data and John Campbell for sharing his dividend yield data. Helpful comments were provided by John Campbell, Fabio Canova, Steve Cecchetti, Frank Diebold, Stefan Gerlach, Charles Goodhart, Clive Granger, Mike McCracken, Marianne Nessen, Tony Rodrigues, Chris Sims, two referees, and participants at the June 2000 Sveriges Riksbank Conference on Asset Markets and Monetary Policy. We thank Jean-Philippe Laforte for research assistance. This research was funded in part by NSF grants SBR-9730489 and SBR-0214131.

Upload: others

Post on 19-Jul-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Journal of Economic Literature Vol XLI (September 2003) pp 788ndash829

Forecasting Output and Inflation The Role of Asset Prices

JAMES H STOCK and Mark W Watson1

788

1 Introduction

Because asset prices are forward-lookingthey constitute a class of potentially use-

ful predictors of inflation and output growthThe premise that interest rates and assetprices contain useful information aboutfuture economic developments embodiesfoundational concepts of macroeconomicsIrving Fisherrsquos theory that the nominal inter-est rate is the real rate plus expected infla-tion the notion that a monetary contractionproduces temporarily high interest ratesmdashan inverted yield curvemdashand leads to aneconomic slowdown and the hypothesis thatstock prices reflect the expected present dis-counted value of future earnings IndeedWesley Mitchell and Arthur Burns (1938)included the Dow Jones composite index ofstock prices in their initial list of leading

indicators of expansions and contractions inthe US economy

The past fifteen years have seen consider-able research on forecasting economic activ-ity and inflation using asset prices where weinterpret asset prices broadly as includinginterest rates differences between interestrates (spreads) returns and other measuresrelated to the value of financial or tangibleassets (bonds stocks housing gold etc)This research on asset prices as leading indi-cators arose at least in part from the insta-bility in the 1970s and early 1980s of fore-casts of output and inflation based onmonetary aggregates and of forecasts of infla-tion based on the (non-expectational) Phillipscurve One problem with using monetaryaggregates for forecasting is that they requireongoing redefinition as new financial instru-ments are introduced In contrast assetprices and returns typically are observed inreal time with negligible measurement error

The now-large literature on forecastingusing asset prices has identified a number ofasset prices as leading indicators of eithereconomic activity or inflation these includeinterest rates term spreads stock returnsdividend yields and exchange rates This lit-erature is of interest from several perspec-tives First and most obviously those whosedaily task it is to produce forecastsmdashnotablyeconomists at central banks and business

1 Stock Department of Economics HarvardUniversity and NBER Watson Woodrow Wilson Schooland Department of Economics Princeton University andNBER We thank Charles Goodhart and Boris Hofmannfor sharing their housing price data and John Campbell forsharing his dividend yield data Helpful comments wereprovided by John Campbell Fabio Canova SteveCecchetti Frank Diebold Stefan Gerlach CharlesGoodhart Clive Granger Mike McCracken MarianneNessen Tony Rodrigues Chris Sims two referees andparticipants at the June 2000 Sveriges RiksbankConference on Asset Markets and Monetary Policy Wethank Jean-Philippe Laforte for research assistance Thisresearch was funded in part by NSF grants SBR-9730489and SBR-0214131

Stock and Watson Forecasting Output and Inflation 789

economistsmdashneed to know which if anyasset prices provide reliable and potent fore-casts of output growth and inflation Secondknowledge of which asset prices are usefulfor forecasting and which are not consti-tutes a set of stylized facts to guide thosemacroeconomists mainly interested in under-standing the workings of modern economiesThird the empirical failure of the 1960s-vintage Phillips curve was one of the crucialdevelopments that led to rational expecta-tions macroeconomics and understanding ifand how forecasts based on asset prices breakdown could lead to further changes or refine-ments in macroeconomic models

This article begins in section 2 with a sum-mary of the econometric methods used inthis literature to evaluate predictive contentWe then review the large literature on assetprices as predictors of real economic activityand inflation This review contained in sec-tion 3 covers 93 articles and working papersand emphasizes developments during thepast fifteen years We focus exclusively onforecasts of output and inflation forecasts ofvolatility which are used mainly in financehave been reviewed recently in Ser-HuangPoon and Clive Granger (2003) Next weundertake our own empirical assessment ofthe practical value of asset prices for short- tomedium-term economic forecasting themethods data and results are presented insections 4ndash7 This analysis uses quarterly dataon as many as 43 variables from each of sevendeveloped economies (Canada FranceGermany Italy Japan the United Kingdomand the United States) over 1959ndash99 (someseries are available only for a shorter period)Most of these predictors are asset prices butfor comparison purposes we also considerselected measures of real economic activitywages prices and the money supply

Our analysis of the literature and the dataleads to four main conclusions

First some asset prices have substantialand statistically significant marginal predic-tive content for output growth at some timesin some countries Whether this predictive

content can be exploited reliably is less clearfor this requires knowing a priori what assetprice works when in which country The evi-dence that asset prices are useful for fore-casting output growth is stronger than forinflation

Second forecasts based on individualindicators are unstable Finding an indica-tor that predicts well in one period is noguarantee that it will predict well in laterperiods It appears that instability of predic-tive relations based on asset prices (likemany other candidate leading indicators) isthe norm

Third although the most common econo-metric method of identifying a potentiallyuseful predictor is to rely on in-sample sig-nificance tests such as Granger causalitytests doing so provides little assurance thatthe identified predictive relation is stableIndeed the empirical results indicate that asignificant Granger causality statistic con-tains little or no information about whetherthe indicator has been a reliable (potent andstable) predictor

Fourth simple methods for combiningthe information in the various predictorssuch as computing the median of a panel offorecasts based on individual asset pricesseem to circumvent the worst of these insta-bility problems

Some of these conclusions could be inter-preted negatively by those (ourselves in-cluded) who have worked in this area But inthis review we argue instead that they re-flect limitations of conventional models andeconometric procedures not a fundamentalabsence of predictive relations in the econ-omy the challenge is to develop methodsbetter geared to the intermittent and evolv-ing nature of these predictive relations Weexpand on these ideas in section 8

2 Methods for Evaluating Forecasts andPredictive Content

Econometric methods for measuringpredictive content can be divided into two

790 Journal of Economic Literature Vol XLI (September 2003)

groups in-sample and out-of-samplemethods

21 In-Sample Measures of Predictive Content

Suppose we want to know whether a can-didate variable X is useful for forecasting avariable of interest Y For example Xt couldbe the value of the term spread in quarter tand Yt 1 1 could be the growth rate of realGDP in the next quarter so Yt 1 1 5400ln(GDPt 1 1GDPt) 5 400Dln(GDPt 1 1)(the factor of 400 standardizes the units toannual percentage growth rates) A simpleframework for assessing predictive content isthe linear regression model relating thefuture value of Y to the current value of X

Yt 1 1 5 b0 1 b1Xt 1 ut 1 1 (1)

where b0 and b1 are unknown parametersand ut 1 1 is an error term If b1 0 thentodayrsquos value of X can be used to forecast thevalue of Y in the next period The nullhypothesis that Xt has no predictive contentcan be tested by computing the t-statistic onb1 The economic significance of Xt as a pre-dictor can be assessed using the regressionR2 and the standard error of the regression(SER) the estimate of the standard deviationof ut 1 1 Because the error term can be heteroskedastic (that is the variance of ut 1 1can depend on Xt) andor autocorrelated (ut 1 1 can be correlated with its previous val-ues) the t-statistic should be computed us-ing heteroskedasticity- and autocorrelation-consistent (HAC) standard errors

This simple framework has an importantlimitation if Yt 1 1 is serially correlated as istypically the case for time series variables itsown past values are themselves useful pre-dictors Thus a more discerning questionthan that studied using (1) is whether Xt haspredictive content for Yt 1 1 above andbeyond that contained in its past valuesMoreover additional lagged values of Xt alsomight be useful predictors This leads to theextension of (1) in which multiple lagged

values of Xt and Yt appear This multipleregression model is conventionally ex-pressed using lag polynomials Let b1(L) andb2(L) denote lag polynomials so thatb 1 ( L ) X t 5 b 1 1 X t 1 b 1 2 X t 21 1 hellip 1b1pXt 2 p 1 1 where p is the numberof lagged values of X included (we refer to Xtas a lagged value because it is lagged relativeto the variable to be forecasted which isdated t 1 1 in (1)) Then the extendedregression model is the autoregressive dis-tributed lag (ADL) model

Yt 1 1 5 b0 1 b1(L)Xt 1 b2(L)Yt 1 ut 1 1 (2)

In the context of the ADL model (2) thehypothesis that Xt has no predictive contentfor Yt 1 1 above and beyond that in lags of Ycorresponds to the hypothesis that b1(L) 50 that is that each of the lag polynomialcoefficients equals zero This hypothesis canbe tested using the (heteroskedasticity-robust) F-statistic This F-statistic is com-monly called the Granger causality test sta-tistic The economic value of the additionalforecasting content of Xt can be assessed bycomputing the partial R2 of the regression orby computing the ratio of the SER (or itssquare) of the regression (2) to that of a uni-variate autoregression (AR) which is (2) inwhich Xt and its lags are excluded

Equation (2) applies to forecasts one period ahead but it is readily modified formultistep-ahead forecasts by replacing Yt 1 1with the suitable h-period ahead value Forexample if the variable being forecast is the percentage growth of real GDP over the next eight quarters then the depen-dent variable in (2) becomes 550ln(GDPt 1 8GDPt) where the factor of50 standardizes the units to be annual per-centage growth rates In general the h-stepahead forecasting regression can be written

5 b0 + b1(L)Xt + b2(L)Yt + (3)

Because the data are overlapping theerror term in (3) is serially correlatedso the test of predictive content based on (3)

u ht 1h

u ht 1hY h

t 1 h

Y 8t 1 8

Stock and Watson Forecasting Output and Inflation 791

2 An alternative to the ldquoh-step ahead projectionrdquoapproach in (3) is to estimate a vector autoregression(VAR) or some other joint one-step ahead model for Xt andYt then iterate this model forward for h periods Almost allthe papers in the asset price-as-predictor literature use theh-step ahead projection method If the VAR is correctlyspecified then the VAR iterated forecasts are more effi-cient asymptotically but the h-step ahead projection fore-cast reduces the potential impact of specification error inthe one-step ahead model

3 Many macroeconomic time series are subject to datarevisions An additional step towards forecasting reality isto construct pseudo out-of-sample forecasts using real-time datandashin the example the vintage of the data current-ly available as of 1989IV Implementing this in practicerequires using large data sets with many vintages Suchdata are now available (Dean Croushore and Tom Stark2003) The data revision issue is however less importantin the literature of concern in this review than elsewherea virtue of asset price data is that they are measured withnegligible error in real time and are not revised nor is theCPI revised One question is whether these asset pricesshould predict early or late (ldquofinalrdquo) vintages of GDP Theimplicit view in this literature is that the best way to eval-uate a true predictive relation is to use the best (final) esti-mate of GDP and we adopt that view here

producing a sequence of pseudo out-of-sample forecasts The model estimationstage can be complex possibly entailing theestimation of a large number of models andselecting among them based on some crite-rion for example the lag length of anautoregression could be selected using aninformation criterion such as the Akaike orBayes information criteria (AIC or BIC)Critically however all model selection andestimation must be done using data availableprior to making the forecastmdashin the exam-ple using only the data available through1989IV Pseudo out-of-sample measures offorecast accuracy have several desirablecharacteristics most notably from the per-spective of this survey being their ability todetect changes in parameters towards theend of the sample3

A common way to quantify pseudo out-of-sample forecast performance is to com-pute the mean squared forecast error of acandidate forecast (forecast i) relative to abenchmark (forecast 0) For example thecandidate forecast could be based on anasset price and the benchmark could comefrom a univariate autoregression Let

and be the benchmark andith candidate pseudo out-of-sample fore-casts of made using data throughtime t Then the h-step ahead meansquared forecast error (MSFE) of forecasti relative to that of the benchmark fore-cast is

Yht 1 h

Yhit 1 h tYh

0 t 1 h t

(the test of b1(L) 5 0) should be computedusing HAC standard errors2

The stability of the coefficients in the forecasting relation (3) can be assessed by avariety of methods including testing forbreaks in coefficients and estimation of mod-els with time-varying parameters Thesemethods are used infrequently in this litera-ture so we do not discuss them here Wereturn to in-sample tests for parameter stability when we describe our empiricalmethods in section 4

The tools discussed so far examine howuseful X would have been for predicting Yhad you been able to use the regression coef-ficients estimated by the full-sample regres-sion If coefficients change over time thisfull-sample analysis can be misleading forout-of-sample forecasting Therefore evalu-ations of predictive content also should relyon statistics that are designed to simulatemore closely actual real-time forecastingwhich we refer to generally as pseudo out-of-sample forecast evaluation

22 Pseudo Out-of-Sample Measures of Predictive Content

Pseudo out-of-sample measures of predic-tive content entail simulating real-time fore-casting Suppose the researcher has quarter-ly data to make the pseudo-forecast for1990I she estimates the model using dataavailable through 1989IV then uses thisestimated model to produce the 1990I fore-cast just as she would were it truly 1989IVThis is repeated throughout the samplemoving ahead one quarter at a time thereby

792 Journal of Economic Literature Vol XLI (September 2003)

(4)

where T1 and T2 ndash h are respectively the firstand last dates over which the pseudo out-of-sample forecast is computed (so that fore-casts are made for dates t 5 T1 1 hhellip T2)

If its relative MSFE is less than one thecandidate forecast is estimated to have per-formed better than the benchmark Ofcourse this could happen simply because ofsampling variability so to determinewhether a relative MSFE less than one is sta-tistically significant requires testing thehypothesis that the population relativeMSFE 5 1 against the alternative that it isless than one When neither model has anyestimated parameters this is done using themethods of Francis Diebold and RobertMariano (1995) Kenneth West (1996) treatsthe case that at least one model has esti-mated parameters and the models are notnested (that is the benchmark model is nota special case of model i) If as is the case inmuch of the literature we review the bench-mark is nested within model i then themethods developed by Michael McCracken(1999) and Todd Clark and McCracken(2001) apply These econometric methodshave been developed only recently so arealmost entirely absent from the literaturereviewed in section 3

In the empirical analysis in sections 6 and7 we use the relative mean squared forecasterror criterion in (4) because of its familiar-ity and ease of interpretation Many varia-tions on this method are available howeverAn alternative to the recursive estimationscheme outlined above is to use a fixed num-ber of observations (a rolling window) forestimating the forecasting model Alsosquared error loss can be replaced with a dif-ferent loss function for example one couldcompute relative mean absolute error Other

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

it1h t22

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

0 t1h t22

statistics such as statistics that test for fore-cast encompassing or ones that assess theaccuracy of the forecasted direction ofchange can be used in addition to a relativeerror measure (see M Hashem Pesaran andSpyros Skouras 2002 and McCracken andWest 2002 for a discussion of alternativeforecast evaluation statistics) For a textbookintroduction and worked empirical examplesof pseudo out-of-sample forecast compar-ison see Stock and Watson (2003a section126) For an introduction to the recent literature on forecast comparisons seeMcCracken and West (2002)

3 Literature Survey

This survey first reviews papers that useasset prices as predictors of inflation andoroutput growth then provides a brief selec-tive summary of recent developments usingnonfinancial indicators Although we men-tion some historical precedents our reviewfocuses on developments within the past fif-teen years The section concludes with anattempt to draw some general conclusionsfrom this literature

31 Forecasts Using Asset Prices

Interest rates Short-term interest rateshave a long history of use as predictors of out-put and inflation Notably using data for theUnited States Christopher Sims (1980) foundthat including the commercial paper rate invector autoregressions (VARs) with outputinflation and money eliminated the marginalpredictive content of money for real outputThis result has been confirmed in numerousstudies eg Ben Bernanke and Alan Blinder(1992) for the United States who suggestedthat the federal funds rate is the appropriateshort-run measure of monetary policy ratherthan the growth of monetary aggregates Mostof the research involving interest rate spreadshowever has found that the level (or change)of a short rate has little marginal predictivecontent for output once spreads are included

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 2: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 789

economistsmdashneed to know which if anyasset prices provide reliable and potent fore-casts of output growth and inflation Secondknowledge of which asset prices are usefulfor forecasting and which are not consti-tutes a set of stylized facts to guide thosemacroeconomists mainly interested in under-standing the workings of modern economiesThird the empirical failure of the 1960s-vintage Phillips curve was one of the crucialdevelopments that led to rational expecta-tions macroeconomics and understanding ifand how forecasts based on asset prices breakdown could lead to further changes or refine-ments in macroeconomic models

This article begins in section 2 with a sum-mary of the econometric methods used inthis literature to evaluate predictive contentWe then review the large literature on assetprices as predictors of real economic activityand inflation This review contained in sec-tion 3 covers 93 articles and working papersand emphasizes developments during thepast fifteen years We focus exclusively onforecasts of output and inflation forecasts ofvolatility which are used mainly in financehave been reviewed recently in Ser-HuangPoon and Clive Granger (2003) Next weundertake our own empirical assessment ofthe practical value of asset prices for short- tomedium-term economic forecasting themethods data and results are presented insections 4ndash7 This analysis uses quarterly dataon as many as 43 variables from each of sevendeveloped economies (Canada FranceGermany Italy Japan the United Kingdomand the United States) over 1959ndash99 (someseries are available only for a shorter period)Most of these predictors are asset prices butfor comparison purposes we also considerselected measures of real economic activitywages prices and the money supply

Our analysis of the literature and the dataleads to four main conclusions

First some asset prices have substantialand statistically significant marginal predic-tive content for output growth at some timesin some countries Whether this predictive

content can be exploited reliably is less clearfor this requires knowing a priori what assetprice works when in which country The evi-dence that asset prices are useful for fore-casting output growth is stronger than forinflation

Second forecasts based on individualindicators are unstable Finding an indica-tor that predicts well in one period is noguarantee that it will predict well in laterperiods It appears that instability of predic-tive relations based on asset prices (likemany other candidate leading indicators) isthe norm

Third although the most common econo-metric method of identifying a potentiallyuseful predictor is to rely on in-sample sig-nificance tests such as Granger causalitytests doing so provides little assurance thatthe identified predictive relation is stableIndeed the empirical results indicate that asignificant Granger causality statistic con-tains little or no information about whetherthe indicator has been a reliable (potent andstable) predictor

Fourth simple methods for combiningthe information in the various predictorssuch as computing the median of a panel offorecasts based on individual asset pricesseem to circumvent the worst of these insta-bility problems

Some of these conclusions could be inter-preted negatively by those (ourselves in-cluded) who have worked in this area But inthis review we argue instead that they re-flect limitations of conventional models andeconometric procedures not a fundamentalabsence of predictive relations in the econ-omy the challenge is to develop methodsbetter geared to the intermittent and evolv-ing nature of these predictive relations Weexpand on these ideas in section 8

2 Methods for Evaluating Forecasts andPredictive Content

Econometric methods for measuringpredictive content can be divided into two

790 Journal of Economic Literature Vol XLI (September 2003)

groups in-sample and out-of-samplemethods

21 In-Sample Measures of Predictive Content

Suppose we want to know whether a can-didate variable X is useful for forecasting avariable of interest Y For example Xt couldbe the value of the term spread in quarter tand Yt 1 1 could be the growth rate of realGDP in the next quarter so Yt 1 1 5400ln(GDPt 1 1GDPt) 5 400Dln(GDPt 1 1)(the factor of 400 standardizes the units toannual percentage growth rates) A simpleframework for assessing predictive content isthe linear regression model relating thefuture value of Y to the current value of X

Yt 1 1 5 b0 1 b1Xt 1 ut 1 1 (1)

where b0 and b1 are unknown parametersand ut 1 1 is an error term If b1 0 thentodayrsquos value of X can be used to forecast thevalue of Y in the next period The nullhypothesis that Xt has no predictive contentcan be tested by computing the t-statistic onb1 The economic significance of Xt as a pre-dictor can be assessed using the regressionR2 and the standard error of the regression(SER) the estimate of the standard deviationof ut 1 1 Because the error term can be heteroskedastic (that is the variance of ut 1 1can depend on Xt) andor autocorrelated (ut 1 1 can be correlated with its previous val-ues) the t-statistic should be computed us-ing heteroskedasticity- and autocorrelation-consistent (HAC) standard errors

This simple framework has an importantlimitation if Yt 1 1 is serially correlated as istypically the case for time series variables itsown past values are themselves useful pre-dictors Thus a more discerning questionthan that studied using (1) is whether Xt haspredictive content for Yt 1 1 above andbeyond that contained in its past valuesMoreover additional lagged values of Xt alsomight be useful predictors This leads to theextension of (1) in which multiple lagged

values of Xt and Yt appear This multipleregression model is conventionally ex-pressed using lag polynomials Let b1(L) andb2(L) denote lag polynomials so thatb 1 ( L ) X t 5 b 1 1 X t 1 b 1 2 X t 21 1 hellip 1b1pXt 2 p 1 1 where p is the numberof lagged values of X included (we refer to Xtas a lagged value because it is lagged relativeto the variable to be forecasted which isdated t 1 1 in (1)) Then the extendedregression model is the autoregressive dis-tributed lag (ADL) model

Yt 1 1 5 b0 1 b1(L)Xt 1 b2(L)Yt 1 ut 1 1 (2)

In the context of the ADL model (2) thehypothesis that Xt has no predictive contentfor Yt 1 1 above and beyond that in lags of Ycorresponds to the hypothesis that b1(L) 50 that is that each of the lag polynomialcoefficients equals zero This hypothesis canbe tested using the (heteroskedasticity-robust) F-statistic This F-statistic is com-monly called the Granger causality test sta-tistic The economic value of the additionalforecasting content of Xt can be assessed bycomputing the partial R2 of the regression orby computing the ratio of the SER (or itssquare) of the regression (2) to that of a uni-variate autoregression (AR) which is (2) inwhich Xt and its lags are excluded

Equation (2) applies to forecasts one period ahead but it is readily modified formultistep-ahead forecasts by replacing Yt 1 1with the suitable h-period ahead value Forexample if the variable being forecast is the percentage growth of real GDP over the next eight quarters then the depen-dent variable in (2) becomes 550ln(GDPt 1 8GDPt) where the factor of50 standardizes the units to be annual per-centage growth rates In general the h-stepahead forecasting regression can be written

5 b0 + b1(L)Xt + b2(L)Yt + (3)

Because the data are overlapping theerror term in (3) is serially correlatedso the test of predictive content based on (3)

u ht 1h

u ht 1hY h

t 1 h

Y 8t 1 8

Stock and Watson Forecasting Output and Inflation 791

2 An alternative to the ldquoh-step ahead projectionrdquoapproach in (3) is to estimate a vector autoregression(VAR) or some other joint one-step ahead model for Xt andYt then iterate this model forward for h periods Almost allthe papers in the asset price-as-predictor literature use theh-step ahead projection method If the VAR is correctlyspecified then the VAR iterated forecasts are more effi-cient asymptotically but the h-step ahead projection fore-cast reduces the potential impact of specification error inthe one-step ahead model

3 Many macroeconomic time series are subject to datarevisions An additional step towards forecasting reality isto construct pseudo out-of-sample forecasts using real-time datandashin the example the vintage of the data current-ly available as of 1989IV Implementing this in practicerequires using large data sets with many vintages Suchdata are now available (Dean Croushore and Tom Stark2003) The data revision issue is however less importantin the literature of concern in this review than elsewherea virtue of asset price data is that they are measured withnegligible error in real time and are not revised nor is theCPI revised One question is whether these asset pricesshould predict early or late (ldquofinalrdquo) vintages of GDP Theimplicit view in this literature is that the best way to eval-uate a true predictive relation is to use the best (final) esti-mate of GDP and we adopt that view here

producing a sequence of pseudo out-of-sample forecasts The model estimationstage can be complex possibly entailing theestimation of a large number of models andselecting among them based on some crite-rion for example the lag length of anautoregression could be selected using aninformation criterion such as the Akaike orBayes information criteria (AIC or BIC)Critically however all model selection andestimation must be done using data availableprior to making the forecastmdashin the exam-ple using only the data available through1989IV Pseudo out-of-sample measures offorecast accuracy have several desirablecharacteristics most notably from the per-spective of this survey being their ability todetect changes in parameters towards theend of the sample3

A common way to quantify pseudo out-of-sample forecast performance is to com-pute the mean squared forecast error of acandidate forecast (forecast i) relative to abenchmark (forecast 0) For example thecandidate forecast could be based on anasset price and the benchmark could comefrom a univariate autoregression Let

and be the benchmark andith candidate pseudo out-of-sample fore-casts of made using data throughtime t Then the h-step ahead meansquared forecast error (MSFE) of forecasti relative to that of the benchmark fore-cast is

Yht 1 h

Yhit 1 h tYh

0 t 1 h t

(the test of b1(L) 5 0) should be computedusing HAC standard errors2

The stability of the coefficients in the forecasting relation (3) can be assessed by avariety of methods including testing forbreaks in coefficients and estimation of mod-els with time-varying parameters Thesemethods are used infrequently in this litera-ture so we do not discuss them here Wereturn to in-sample tests for parameter stability when we describe our empiricalmethods in section 4

The tools discussed so far examine howuseful X would have been for predicting Yhad you been able to use the regression coef-ficients estimated by the full-sample regres-sion If coefficients change over time thisfull-sample analysis can be misleading forout-of-sample forecasting Therefore evalu-ations of predictive content also should relyon statistics that are designed to simulatemore closely actual real-time forecastingwhich we refer to generally as pseudo out-of-sample forecast evaluation

22 Pseudo Out-of-Sample Measures of Predictive Content

Pseudo out-of-sample measures of predic-tive content entail simulating real-time fore-casting Suppose the researcher has quarter-ly data to make the pseudo-forecast for1990I she estimates the model using dataavailable through 1989IV then uses thisestimated model to produce the 1990I fore-cast just as she would were it truly 1989IVThis is repeated throughout the samplemoving ahead one quarter at a time thereby

792 Journal of Economic Literature Vol XLI (September 2003)

(4)

where T1 and T2 ndash h are respectively the firstand last dates over which the pseudo out-of-sample forecast is computed (so that fore-casts are made for dates t 5 T1 1 hhellip T2)

If its relative MSFE is less than one thecandidate forecast is estimated to have per-formed better than the benchmark Ofcourse this could happen simply because ofsampling variability so to determinewhether a relative MSFE less than one is sta-tistically significant requires testing thehypothesis that the population relativeMSFE 5 1 against the alternative that it isless than one When neither model has anyestimated parameters this is done using themethods of Francis Diebold and RobertMariano (1995) Kenneth West (1996) treatsthe case that at least one model has esti-mated parameters and the models are notnested (that is the benchmark model is nota special case of model i) If as is the case inmuch of the literature we review the bench-mark is nested within model i then themethods developed by Michael McCracken(1999) and Todd Clark and McCracken(2001) apply These econometric methodshave been developed only recently so arealmost entirely absent from the literaturereviewed in section 3

In the empirical analysis in sections 6 and7 we use the relative mean squared forecasterror criterion in (4) because of its familiar-ity and ease of interpretation Many varia-tions on this method are available howeverAn alternative to the recursive estimationscheme outlined above is to use a fixed num-ber of observations (a rolling window) forestimating the forecasting model Alsosquared error loss can be replaced with a dif-ferent loss function for example one couldcompute relative mean absolute error Other

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

it1h t22

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

0 t1h t22

statistics such as statistics that test for fore-cast encompassing or ones that assess theaccuracy of the forecasted direction ofchange can be used in addition to a relativeerror measure (see M Hashem Pesaran andSpyros Skouras 2002 and McCracken andWest 2002 for a discussion of alternativeforecast evaluation statistics) For a textbookintroduction and worked empirical examplesof pseudo out-of-sample forecast compar-ison see Stock and Watson (2003a section126) For an introduction to the recent literature on forecast comparisons seeMcCracken and West (2002)

3 Literature Survey

This survey first reviews papers that useasset prices as predictors of inflation andoroutput growth then provides a brief selec-tive summary of recent developments usingnonfinancial indicators Although we men-tion some historical precedents our reviewfocuses on developments within the past fif-teen years The section concludes with anattempt to draw some general conclusionsfrom this literature

31 Forecasts Using Asset Prices

Interest rates Short-term interest rateshave a long history of use as predictors of out-put and inflation Notably using data for theUnited States Christopher Sims (1980) foundthat including the commercial paper rate invector autoregressions (VARs) with outputinflation and money eliminated the marginalpredictive content of money for real outputThis result has been confirmed in numerousstudies eg Ben Bernanke and Alan Blinder(1992) for the United States who suggestedthat the federal funds rate is the appropriateshort-run measure of monetary policy ratherthan the growth of monetary aggregates Mostof the research involving interest rate spreadshowever has found that the level (or change)of a short rate has little marginal predictivecontent for output once spreads are included

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 3: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

790 Journal of Economic Literature Vol XLI (September 2003)

groups in-sample and out-of-samplemethods

21 In-Sample Measures of Predictive Content

Suppose we want to know whether a can-didate variable X is useful for forecasting avariable of interest Y For example Xt couldbe the value of the term spread in quarter tand Yt 1 1 could be the growth rate of realGDP in the next quarter so Yt 1 1 5400ln(GDPt 1 1GDPt) 5 400Dln(GDPt 1 1)(the factor of 400 standardizes the units toannual percentage growth rates) A simpleframework for assessing predictive content isthe linear regression model relating thefuture value of Y to the current value of X

Yt 1 1 5 b0 1 b1Xt 1 ut 1 1 (1)

where b0 and b1 are unknown parametersand ut 1 1 is an error term If b1 0 thentodayrsquos value of X can be used to forecast thevalue of Y in the next period The nullhypothesis that Xt has no predictive contentcan be tested by computing the t-statistic onb1 The economic significance of Xt as a pre-dictor can be assessed using the regressionR2 and the standard error of the regression(SER) the estimate of the standard deviationof ut 1 1 Because the error term can be heteroskedastic (that is the variance of ut 1 1can depend on Xt) andor autocorrelated (ut 1 1 can be correlated with its previous val-ues) the t-statistic should be computed us-ing heteroskedasticity- and autocorrelation-consistent (HAC) standard errors

This simple framework has an importantlimitation if Yt 1 1 is serially correlated as istypically the case for time series variables itsown past values are themselves useful pre-dictors Thus a more discerning questionthan that studied using (1) is whether Xt haspredictive content for Yt 1 1 above andbeyond that contained in its past valuesMoreover additional lagged values of Xt alsomight be useful predictors This leads to theextension of (1) in which multiple lagged

values of Xt and Yt appear This multipleregression model is conventionally ex-pressed using lag polynomials Let b1(L) andb2(L) denote lag polynomials so thatb 1 ( L ) X t 5 b 1 1 X t 1 b 1 2 X t 21 1 hellip 1b1pXt 2 p 1 1 where p is the numberof lagged values of X included (we refer to Xtas a lagged value because it is lagged relativeto the variable to be forecasted which isdated t 1 1 in (1)) Then the extendedregression model is the autoregressive dis-tributed lag (ADL) model

Yt 1 1 5 b0 1 b1(L)Xt 1 b2(L)Yt 1 ut 1 1 (2)

In the context of the ADL model (2) thehypothesis that Xt has no predictive contentfor Yt 1 1 above and beyond that in lags of Ycorresponds to the hypothesis that b1(L) 50 that is that each of the lag polynomialcoefficients equals zero This hypothesis canbe tested using the (heteroskedasticity-robust) F-statistic This F-statistic is com-monly called the Granger causality test sta-tistic The economic value of the additionalforecasting content of Xt can be assessed bycomputing the partial R2 of the regression orby computing the ratio of the SER (or itssquare) of the regression (2) to that of a uni-variate autoregression (AR) which is (2) inwhich Xt and its lags are excluded

Equation (2) applies to forecasts one period ahead but it is readily modified formultistep-ahead forecasts by replacing Yt 1 1with the suitable h-period ahead value Forexample if the variable being forecast is the percentage growth of real GDP over the next eight quarters then the depen-dent variable in (2) becomes 550ln(GDPt 1 8GDPt) where the factor of50 standardizes the units to be annual per-centage growth rates In general the h-stepahead forecasting regression can be written

5 b0 + b1(L)Xt + b2(L)Yt + (3)

Because the data are overlapping theerror term in (3) is serially correlatedso the test of predictive content based on (3)

u ht 1h

u ht 1hY h

t 1 h

Y 8t 1 8

Stock and Watson Forecasting Output and Inflation 791

2 An alternative to the ldquoh-step ahead projectionrdquoapproach in (3) is to estimate a vector autoregression(VAR) or some other joint one-step ahead model for Xt andYt then iterate this model forward for h periods Almost allthe papers in the asset price-as-predictor literature use theh-step ahead projection method If the VAR is correctlyspecified then the VAR iterated forecasts are more effi-cient asymptotically but the h-step ahead projection fore-cast reduces the potential impact of specification error inthe one-step ahead model

3 Many macroeconomic time series are subject to datarevisions An additional step towards forecasting reality isto construct pseudo out-of-sample forecasts using real-time datandashin the example the vintage of the data current-ly available as of 1989IV Implementing this in practicerequires using large data sets with many vintages Suchdata are now available (Dean Croushore and Tom Stark2003) The data revision issue is however less importantin the literature of concern in this review than elsewherea virtue of asset price data is that they are measured withnegligible error in real time and are not revised nor is theCPI revised One question is whether these asset pricesshould predict early or late (ldquofinalrdquo) vintages of GDP Theimplicit view in this literature is that the best way to eval-uate a true predictive relation is to use the best (final) esti-mate of GDP and we adopt that view here

producing a sequence of pseudo out-of-sample forecasts The model estimationstage can be complex possibly entailing theestimation of a large number of models andselecting among them based on some crite-rion for example the lag length of anautoregression could be selected using aninformation criterion such as the Akaike orBayes information criteria (AIC or BIC)Critically however all model selection andestimation must be done using data availableprior to making the forecastmdashin the exam-ple using only the data available through1989IV Pseudo out-of-sample measures offorecast accuracy have several desirablecharacteristics most notably from the per-spective of this survey being their ability todetect changes in parameters towards theend of the sample3

A common way to quantify pseudo out-of-sample forecast performance is to com-pute the mean squared forecast error of acandidate forecast (forecast i) relative to abenchmark (forecast 0) For example thecandidate forecast could be based on anasset price and the benchmark could comefrom a univariate autoregression Let

and be the benchmark andith candidate pseudo out-of-sample fore-casts of made using data throughtime t Then the h-step ahead meansquared forecast error (MSFE) of forecasti relative to that of the benchmark fore-cast is

Yht 1 h

Yhit 1 h tYh

0 t 1 h t

(the test of b1(L) 5 0) should be computedusing HAC standard errors2

The stability of the coefficients in the forecasting relation (3) can be assessed by avariety of methods including testing forbreaks in coefficients and estimation of mod-els with time-varying parameters Thesemethods are used infrequently in this litera-ture so we do not discuss them here Wereturn to in-sample tests for parameter stability when we describe our empiricalmethods in section 4

The tools discussed so far examine howuseful X would have been for predicting Yhad you been able to use the regression coef-ficients estimated by the full-sample regres-sion If coefficients change over time thisfull-sample analysis can be misleading forout-of-sample forecasting Therefore evalu-ations of predictive content also should relyon statistics that are designed to simulatemore closely actual real-time forecastingwhich we refer to generally as pseudo out-of-sample forecast evaluation

22 Pseudo Out-of-Sample Measures of Predictive Content

Pseudo out-of-sample measures of predic-tive content entail simulating real-time fore-casting Suppose the researcher has quarter-ly data to make the pseudo-forecast for1990I she estimates the model using dataavailable through 1989IV then uses thisestimated model to produce the 1990I fore-cast just as she would were it truly 1989IVThis is repeated throughout the samplemoving ahead one quarter at a time thereby

792 Journal of Economic Literature Vol XLI (September 2003)

(4)

where T1 and T2 ndash h are respectively the firstand last dates over which the pseudo out-of-sample forecast is computed (so that fore-casts are made for dates t 5 T1 1 hhellip T2)

If its relative MSFE is less than one thecandidate forecast is estimated to have per-formed better than the benchmark Ofcourse this could happen simply because ofsampling variability so to determinewhether a relative MSFE less than one is sta-tistically significant requires testing thehypothesis that the population relativeMSFE 5 1 against the alternative that it isless than one When neither model has anyestimated parameters this is done using themethods of Francis Diebold and RobertMariano (1995) Kenneth West (1996) treatsthe case that at least one model has esti-mated parameters and the models are notnested (that is the benchmark model is nota special case of model i) If as is the case inmuch of the literature we review the bench-mark is nested within model i then themethods developed by Michael McCracken(1999) and Todd Clark and McCracken(2001) apply These econometric methodshave been developed only recently so arealmost entirely absent from the literaturereviewed in section 3

In the empirical analysis in sections 6 and7 we use the relative mean squared forecasterror criterion in (4) because of its familiar-ity and ease of interpretation Many varia-tions on this method are available howeverAn alternative to the recursive estimationscheme outlined above is to use a fixed num-ber of observations (a rolling window) forestimating the forecasting model Alsosquared error loss can be replaced with a dif-ferent loss function for example one couldcompute relative mean absolute error Other

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

it1h t22

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

0 t1h t22

statistics such as statistics that test for fore-cast encompassing or ones that assess theaccuracy of the forecasted direction ofchange can be used in addition to a relativeerror measure (see M Hashem Pesaran andSpyros Skouras 2002 and McCracken andWest 2002 for a discussion of alternativeforecast evaluation statistics) For a textbookintroduction and worked empirical examplesof pseudo out-of-sample forecast compar-ison see Stock and Watson (2003a section126) For an introduction to the recent literature on forecast comparisons seeMcCracken and West (2002)

3 Literature Survey

This survey first reviews papers that useasset prices as predictors of inflation andoroutput growth then provides a brief selec-tive summary of recent developments usingnonfinancial indicators Although we men-tion some historical precedents our reviewfocuses on developments within the past fif-teen years The section concludes with anattempt to draw some general conclusionsfrom this literature

31 Forecasts Using Asset Prices

Interest rates Short-term interest rateshave a long history of use as predictors of out-put and inflation Notably using data for theUnited States Christopher Sims (1980) foundthat including the commercial paper rate invector autoregressions (VARs) with outputinflation and money eliminated the marginalpredictive content of money for real outputThis result has been confirmed in numerousstudies eg Ben Bernanke and Alan Blinder(1992) for the United States who suggestedthat the federal funds rate is the appropriateshort-run measure of monetary policy ratherthan the growth of monetary aggregates Mostof the research involving interest rate spreadshowever has found that the level (or change)of a short rate has little marginal predictivecontent for output once spreads are included

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 4: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 791

2 An alternative to the ldquoh-step ahead projectionrdquoapproach in (3) is to estimate a vector autoregression(VAR) or some other joint one-step ahead model for Xt andYt then iterate this model forward for h periods Almost allthe papers in the asset price-as-predictor literature use theh-step ahead projection method If the VAR is correctlyspecified then the VAR iterated forecasts are more effi-cient asymptotically but the h-step ahead projection fore-cast reduces the potential impact of specification error inthe one-step ahead model

3 Many macroeconomic time series are subject to datarevisions An additional step towards forecasting reality isto construct pseudo out-of-sample forecasts using real-time datandashin the example the vintage of the data current-ly available as of 1989IV Implementing this in practicerequires using large data sets with many vintages Suchdata are now available (Dean Croushore and Tom Stark2003) The data revision issue is however less importantin the literature of concern in this review than elsewherea virtue of asset price data is that they are measured withnegligible error in real time and are not revised nor is theCPI revised One question is whether these asset pricesshould predict early or late (ldquofinalrdquo) vintages of GDP Theimplicit view in this literature is that the best way to eval-uate a true predictive relation is to use the best (final) esti-mate of GDP and we adopt that view here

producing a sequence of pseudo out-of-sample forecasts The model estimationstage can be complex possibly entailing theestimation of a large number of models andselecting among them based on some crite-rion for example the lag length of anautoregression could be selected using aninformation criterion such as the Akaike orBayes information criteria (AIC or BIC)Critically however all model selection andestimation must be done using data availableprior to making the forecastmdashin the exam-ple using only the data available through1989IV Pseudo out-of-sample measures offorecast accuracy have several desirablecharacteristics most notably from the per-spective of this survey being their ability todetect changes in parameters towards theend of the sample3

A common way to quantify pseudo out-of-sample forecast performance is to com-pute the mean squared forecast error of acandidate forecast (forecast i) relative to abenchmark (forecast 0) For example thecandidate forecast could be based on anasset price and the benchmark could comefrom a univariate autoregression Let

and be the benchmark andith candidate pseudo out-of-sample fore-casts of made using data throughtime t Then the h-step ahead meansquared forecast error (MSFE) of forecasti relative to that of the benchmark fore-cast is

Yht 1 h

Yhit 1 h tYh

0 t 1 h t

(the test of b1(L) 5 0) should be computedusing HAC standard errors2

The stability of the coefficients in the forecasting relation (3) can be assessed by avariety of methods including testing forbreaks in coefficients and estimation of mod-els with time-varying parameters Thesemethods are used infrequently in this litera-ture so we do not discuss them here Wereturn to in-sample tests for parameter stability when we describe our empiricalmethods in section 4

The tools discussed so far examine howuseful X would have been for predicting Yhad you been able to use the regression coef-ficients estimated by the full-sample regres-sion If coefficients change over time thisfull-sample analysis can be misleading forout-of-sample forecasting Therefore evalu-ations of predictive content also should relyon statistics that are designed to simulatemore closely actual real-time forecastingwhich we refer to generally as pseudo out-of-sample forecast evaluation

22 Pseudo Out-of-Sample Measures of Predictive Content

Pseudo out-of-sample measures of predic-tive content entail simulating real-time fore-casting Suppose the researcher has quarter-ly data to make the pseudo-forecast for1990I she estimates the model using dataavailable through 1989IV then uses thisestimated model to produce the 1990I fore-cast just as she would were it truly 1989IVThis is repeated throughout the samplemoving ahead one quarter at a time thereby

792 Journal of Economic Literature Vol XLI (September 2003)

(4)

where T1 and T2 ndash h are respectively the firstand last dates over which the pseudo out-of-sample forecast is computed (so that fore-casts are made for dates t 5 T1 1 hhellip T2)

If its relative MSFE is less than one thecandidate forecast is estimated to have per-formed better than the benchmark Ofcourse this could happen simply because ofsampling variability so to determinewhether a relative MSFE less than one is sta-tistically significant requires testing thehypothesis that the population relativeMSFE 5 1 against the alternative that it isless than one When neither model has anyestimated parameters this is done using themethods of Francis Diebold and RobertMariano (1995) Kenneth West (1996) treatsthe case that at least one model has esti-mated parameters and the models are notnested (that is the benchmark model is nota special case of model i) If as is the case inmuch of the literature we review the bench-mark is nested within model i then themethods developed by Michael McCracken(1999) and Todd Clark and McCracken(2001) apply These econometric methodshave been developed only recently so arealmost entirely absent from the literaturereviewed in section 3

In the empirical analysis in sections 6 and7 we use the relative mean squared forecasterror criterion in (4) because of its familiar-ity and ease of interpretation Many varia-tions on this method are available howeverAn alternative to the recursive estimationscheme outlined above is to use a fixed num-ber of observations (a rolling window) forestimating the forecasting model Alsosquared error loss can be replaced with a dif-ferent loss function for example one couldcompute relative mean absolute error Other

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

it1h t22

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

0 t1h t22

statistics such as statistics that test for fore-cast encompassing or ones that assess theaccuracy of the forecasted direction ofchange can be used in addition to a relativeerror measure (see M Hashem Pesaran andSpyros Skouras 2002 and McCracken andWest 2002 for a discussion of alternativeforecast evaluation statistics) For a textbookintroduction and worked empirical examplesof pseudo out-of-sample forecast compar-ison see Stock and Watson (2003a section126) For an introduction to the recent literature on forecast comparisons seeMcCracken and West (2002)

3 Literature Survey

This survey first reviews papers that useasset prices as predictors of inflation andoroutput growth then provides a brief selec-tive summary of recent developments usingnonfinancial indicators Although we men-tion some historical precedents our reviewfocuses on developments within the past fif-teen years The section concludes with anattempt to draw some general conclusionsfrom this literature

31 Forecasts Using Asset Prices

Interest rates Short-term interest rateshave a long history of use as predictors of out-put and inflation Notably using data for theUnited States Christopher Sims (1980) foundthat including the commercial paper rate invector autoregressions (VARs) with outputinflation and money eliminated the marginalpredictive content of money for real outputThis result has been confirmed in numerousstudies eg Ben Bernanke and Alan Blinder(1992) for the United States who suggestedthat the federal funds rate is the appropriateshort-run measure of monetary policy ratherthan the growth of monetary aggregates Mostof the research involving interest rate spreadshowever has found that the level (or change)of a short rate has little marginal predictivecontent for output once spreads are included

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 5: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

792 Journal of Economic Literature Vol XLI (September 2003)

(4)

where T1 and T2 ndash h are respectively the firstand last dates over which the pseudo out-of-sample forecast is computed (so that fore-casts are made for dates t 5 T1 1 hhellip T2)

If its relative MSFE is less than one thecandidate forecast is estimated to have per-formed better than the benchmark Ofcourse this could happen simply because ofsampling variability so to determinewhether a relative MSFE less than one is sta-tistically significant requires testing thehypothesis that the population relativeMSFE 5 1 against the alternative that it isless than one When neither model has anyestimated parameters this is done using themethods of Francis Diebold and RobertMariano (1995) Kenneth West (1996) treatsthe case that at least one model has esti-mated parameters and the models are notnested (that is the benchmark model is nota special case of model i) If as is the case inmuch of the literature we review the bench-mark is nested within model i then themethods developed by Michael McCracken(1999) and Todd Clark and McCracken(2001) apply These econometric methodshave been developed only recently so arealmost entirely absent from the literaturereviewed in section 3

In the empirical analysis in sections 6 and7 we use the relative mean squared forecasterror criterion in (4) because of its familiar-ity and ease of interpretation Many varia-tions on this method are available howeverAn alternative to the recursive estimationscheme outlined above is to use a fixed num-ber of observations (a rolling window) forestimating the forecasting model Alsosquared error loss can be replaced with a dif-ferent loss function for example one couldcompute relative mean absolute error Other

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

it1h t22

1T2 2 T1 2 h 1 1 S

T22h

t5T1

1Yht1h 2 Yh

0 t1h t22

statistics such as statistics that test for fore-cast encompassing or ones that assess theaccuracy of the forecasted direction ofchange can be used in addition to a relativeerror measure (see M Hashem Pesaran andSpyros Skouras 2002 and McCracken andWest 2002 for a discussion of alternativeforecast evaluation statistics) For a textbookintroduction and worked empirical examplesof pseudo out-of-sample forecast compar-ison see Stock and Watson (2003a section126) For an introduction to the recent literature on forecast comparisons seeMcCracken and West (2002)

3 Literature Survey

This survey first reviews papers that useasset prices as predictors of inflation andoroutput growth then provides a brief selec-tive summary of recent developments usingnonfinancial indicators Although we men-tion some historical precedents our reviewfocuses on developments within the past fif-teen years The section concludes with anattempt to draw some general conclusionsfrom this literature

31 Forecasts Using Asset Prices

Interest rates Short-term interest rateshave a long history of use as predictors of out-put and inflation Notably using data for theUnited States Christopher Sims (1980) foundthat including the commercial paper rate invector autoregressions (VARs) with outputinflation and money eliminated the marginalpredictive content of money for real outputThis result has been confirmed in numerousstudies eg Ben Bernanke and Alan Blinder(1992) for the United States who suggestedthat the federal funds rate is the appropriateshort-run measure of monetary policy ratherthan the growth of monetary aggregates Mostof the research involving interest rate spreadshowever has found that the level (or change)of a short rate has little marginal predictivecontent for output once spreads are included

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 6: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 793

The term spread and output growth Theterm spread is the difference between inter-est rates on long and short maturity debtusually government debt The literature onterm spreads uses different measures of thisspread the most common being a long gov-ernment bond rate minus a three-monthgovernment bill rate or instead the longbond rate minus an overnight rate (in theUnited States the federal funds rate)

The adage that an inverted yield curve sig-nals a recession was formalized empiricallyapparently independently by a number ofresearchers in the late 1980s includingRobert Laurent (1988 1989) CampbellHarvey (1988 1989) Stock and Watson(1989) Nai-Fu Chen (1991) and ArturoEstrella and Gikas Hardouvelis (1991) Thesestudies mainly focused on using the termspread to predict output growth (or in thecase of Harvey 1988 consumption growth)using US data Of these studies Estrella andHardouvelis (1991) provided the most com-prehensive documentation of the strong (in-sample) predictive content of the spread foroutput including its ability to predict a bi-nary recession indicator in probit regressionsThis early work focused on bivariate rela-tions with the exception of Stock and Watson(1989) who used in-sample statistics forbivariate and multivariate regressions toidentify the term spread and a default spread(the paper-bill spread discussed below) astwo historically potent leading indicators foroutput The work of Eugene Fama (1990)and Frederic Mishkin (1990ab) is alsonotable for they found that the term spreadhas (in-sample bivariate) predictive contentfor real rates especially at shorter horizons

Subsequent work focused on developingeconomic explanations for this relationdetermining whether it is stable over timewithin the United States and ascertainingwhether it holds up in international evidenceThe standard economic explanation for whythe term spread has predictive content foroutput is that the spread is an indicator of aneffective monetary policy monetary tighten-

ing results in short-term interest rates thatare high relative to long-term interest ratesand these high short rates in turn produce aneconomic slowdown (Bernanke and Blinder1992) Notably when placed within a multi-variate model the predictive content of theterm spread can change if monetary policychanges or the composition of economicshocks changes (Frank Smets and KostasTsatsaronis 1997) Movements in expectedfuture interest rates might not account for allthe predictive power of the term spreadhowever James Hamilton and Dong HeonKim (2002) suggested that the term premium(the term spread minus its predicted compo-nent under the expectations hypothesis of theterm structure of interest rates) has impor-tant predictive content for output as well

A closer examination of the US evidencehas led to the conclusion that the predictivecontent of the term spread for economicactivity has diminished since 1985 a pointmade using both pseudo out-of-sample androlling in-sample statistics by JosephHaubrich and Ann Dombrosky (1996) and byMichael Dotsey (1998) Similarly AndrewAng Monika Piazzesi and Min Wei (2003)found that during the 1990s US GDPgrowth is better predicted by the short ratethan the term spread In contrast researchthat focuses on predicting binary recessionevents (instead of output growth itself) sug-gests that the term spread might have hadsome link to the 1990 recession In particularthe ex post analyses of Estrella and Mishkin(1998) Kajal Lahiri and Jiazhou Wang (1996)and Michael Dueker (1997) respectively pro-vided probit and Markov switching modelsthat produce in-sample recession probabili-ties consistent with the term spread providingadvance warning of the 1990 US recessionThese estimated probabilities however werebased on estimated parameters that includethis recession so these are not real time orpseudo out-of-sample recession probabilities

The real-time evidence about the value ofthe spread as an indicator in the 1990 reces-sion is more mixed Laurent (1989) using the

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 7: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

794 Journal of Economic Literature Vol XLI (September 2003)

term spread predicted an imminent reces-sion in the United States Harvey (1989) pub-lished a forecast based on the yield curve thatsuggested ldquoa slowing of economic growth butnot zero or negative growthrdquo from the thirdquarter of 1989 through the third quarter of1990 and the Stock-Watson (1989) experi-mental recession index increased sharplywhen the yield curve flattened in late 1988and early 1989 However the business cyclepeak of July 1990 considerably postdates thepredicted period of these slowdowns asLaurent (1989) wrote ldquorecent spread datasuggest that the slowdown is likely to extendthrough the rest of 1989 and be quite signifi-cantrdquo Moreover Laurentrsquos (1989) forecastwas based in part on a judgmental interpreta-tion that the then-current inversion of theyield curve had special (nonlinear) signifi-cance signaling a downturn more severethan would be suggested by a linear modelIndeed even the largest predicted recessionprobabilities from the in-sample models aremodest 25 percent in Estrella and Mishkinrsquos(1998) probit model and 20 percent inDuekerrsquos (1997) Markov switching model forexample Harvey (1993) interpreted this evi-dence more favorably arguing that becausethe yield curve inverted moderately begin-ning in 1989II it correctly predicted a mod-erate recession six quarters later Our inter-pretation of this episode is that the termspread is an indicator of monetary policy thatmonetary policy was tight during late 1988and that yield-curve based models correctlypredicted a slowdown in 1989 This slow-down was not a recession however and theproximate cause of the recession of 1990 wasnot monetary tightening but rather specialnonmonetary circumstances in particular theinvasion of Kuwait by Iraq and the subse-quent response by US consumers (OlivierBlanchard 1993) This interpretation isbroadly similar to Benjamin Friedman andKenneth Kuttnerrsquos (1998) explanation of thefailure of the paper-bill spread to predict the1990 recession (discussed below)

Stock and Watson (2003b) examine the

behavior of various leading indicators beforeand during the US recession that began inMarch 2001 The term spread did turn neg-ative in advance of this recession the Fedfunds rate exceeded the long bond rate fromJune 2000 through March 2001 This inver-sion however was small by historical stan-dards While regressions of the form (3) pre-dicted a slower rate of economic growth inearly 2001 the predicted slowdown wasmodest four quarter growth forecasts basedon (3) fell 14 percentage points from 33percent in 2000I to a minimum of 19 per-cent in 2000IV still far from the negativegrowth of a recession

One way to get additional evidence on thereliability of the term spread as a predictor ofoutput growth is to examine evidence forother countries Harvey (1991) Zuliu Hu(1993) E Philip Davis and S G B Henry(1994) Charles Plosser and K GeertRouwenhorst (1994) Catherine Bonser-Nealand Timothy Morley (1997) Sharon Kozicki(1997) John Campbell (1999) Estrella andMishkin (1997) Estrella Anthony Rodriguesand Sebastian Schich (2003) and JosephAtta-Mensah and Greg Tkacz (2001) gener-ally conclude that the term spread has pre-dictive content for real output growth inmajor non-US OECD economies EstrellaRodrigues and Schich (2003) use in-samplebreak tests to assess coefficient stability of theforecasting relations and typically fail to rejectthe null hypothesis of stability in the cases inwhich the term spread has the greatest esti-mated predictive content (mainly long-horizon regressions) Additionally HenriBernard and Stefan Gerlach (1998) andEstrella Rodrigues and Schich (2003) pro-vide cross-country evidence on term spreadsas predictors of a binary recession indicatorfor seven non-US OECD countries Unlikemost of these papers Plosser andRouwenhorst (1994) considered multipleregressions that include the level and changeof interest rates and concluded that given thespread the short rate has little predictive con-tent for output in almost all the economies

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 8: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 795

they consider These studies typically used in-sample statistics and data sets that start in1970 or later Three exceptions to this gener-ally sanguine view are Davis and GabrielFagan (1997) Smets and Tsatsaronis (1997)and Fabio Canova and Gianni De Nicolo(2000) Using a pseudo out-of-sample fore-casting design Davis and Fagan (1997) findevidence of subsample instability and reportdisappointing pseudo out-of-sample forecast-ing performance across nine EU economiesSmets and Tsatsaronis (1997) find instabilityin the yield curvendashoutput relation in the1990s in the United States and GermanyCanova and De Nicolo (2000) using in-sample VAR statistics find only a limitedforecasting role for innovations to the termpremium in Germany Japan and theUnited Kingdom

Term spreads and inflation Many stud-ies including some of those already citedalso consider the predictive content of theterm spread for inflation According to theexpectations hypothesis of the term struc-ture of interest rates the forward rate (andthe term spread) should embody marketexpectations of future inflation and thefuture real rate With some notable excep-tions the papers in this literature generallyfind that there is little or no marginal infor-mation content in the nominal interest rateterm structure for future inflation Much ofthe early work which typically claims to findpredictive content did not control for laggedinflation In US data Mishkin (1990a)found no predictive content of term spreadsfor inflation at the short end of the yieldcurve although Mishkin (1990b) found pre-dictive content using spreads that involvelong bond rates Philippe Jorion and Mishkin(1991) and Mishkin (1991) reached similarconclusions using data on ten OECD coun-tries results confirmed by Gerlach (1997)for Germany using Mishkinrsquos methodologyDrawing on Jeffrey Frankelrsquos (1982) earlywork in this area Frankel and Cara Lown(1994) suggested a modification of the termspread based on a weighted average of dif-

ferent maturities that outperformed the sim-ple term spread in Mishkin-style regressions

Mishkinrsquos regressions have a single sto-chastic regressor the term spread (no lags)and in particular do not include lagged infla-tion Inflation is highly persistent howeverand Bernanke and Mishkin (1992) Estrellaand Mishkin (1997) and Kozicki (1997)examined the in-sample marginal predictivecontent of the term spread given laggedinflation Bernanke and Mishkin (1992)found little or no marginal predictive con-tent of the term spread for one-month-ahead inflation in a data set with six largeeconomies once lags of inflation are includ-ed Kozicki (1997) and Estrella and Mishkin(1997) included only a single lag of inflationbut even so they found that doing so sub-stantially reduced the marginal predictivecontent of the term spread for future infla-tion over one to two years For exampleonce lagged inflation is added Kozicki(1997) found that the spread remained sig-nificant for one-year inflation in only two ofthe ten OECD countries she studied inEstrella and Mishkinrsquos (1997) study the termspread was no longer a significant predictorat the one-year horizon in any of their fourcountries although they provided evidencefor predictive content at longer horizons

Default spreads Another strand ofresearch has focused on the predictive con-tent of default spreads primarily for realeconomic activity A default spread is the dif-ference between the interest rates onmatched maturity private debt with differentdegrees of default risk Different authorsmeasure this spread differently and thesedifferences are potentially importantBecause markets for private debt differ sub-stantially across countries and are mostdeveloped for the United States this workhas focused on the United States

In his study of the credit channel duringthe Great Depression Bernanke (1983)showed that during the interwar period theBaa-Treasury bond spread was a useful pre-dictor of industrial production growth Stock

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 9: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

796 Journal of Economic Literature Vol XLI (September 2003)

and Watson (1989) and Friedman andKuttner (1992) studied the default spread asa predictor of real growth in the postwarperiod they found that the spread betweencommercial paper and US Treasury bills ofthe same maturity (three or six months theldquopaper-billrdquo spread) was a potent predictorof output growth (monthly data 1959ndash88 forStock and Watson 1989 quarterly data1960ndash90 for Friedman and Kuttner 1992)Using in-sample statistics Friedman andKuttner (1992) concluded that upon con-trolling for the paper-bill spread monetaryaggregates and interest rates have little predictive content for real output a findingconfirmed by Bernanke and Blinder (1992)and Martin Feldstein and Stock (1994)

Subsequent literature focused on whetherthis predictive relationship is stable overtime Bernanke (1990) used in-sample statis-tics to confirm the strong performance ofpaper-bill spread as predictor of output butby splitting up the sample he also suggestedthat this relation weakened during the1980s This view was affirmed and assertedmore strongly by Mark Thoma and Jo AnnaGray (1994) Rik Hafer and Ali Kutan(1992) and Kenneth Emery (1996) Thomaand Gray (1994) for example found that thepaper-bill spread has strong in-sampleexplanatory power in recursive or rollingregressions but little predictive power inpseudo out-of-sample forecasting exercisesover the 1980s Emery (1996) finds little in-sample explanatory power of the paper-billspread in samples that postdate 1980 Theseauthors interpreted this as a consequence ofspecial events especially in 1973ndash74 whichcontribute to a good in-sample fit but notnecessarily good forecasting performanceDrawing on institutional considerationsJohn Duca (1999) also took this view Ducarsquos(1999) concerns echo Timothy Cookrsquos (1981)warnings about how the changing institu-tional environment and financial innovationscould substantially change markets for short-term debt and thereby alter the relationshipbetween default spreads and real activity

One obvious true out-of-sample predic-tive failure of the paper-bill spread is its fail-ure to rise sharply in advance of the 1990ndash91US recession In their post-mortemFriedman and Kuttner (1998) suggested thatthis predictive failure arose because the1990ndash91 recession was caused in large partby nonmonetary events that would not havebeen detected by the paper-bill spread Theyfurther argued that there were changes inthe commercial paper market unrelated tothe recession that also led to this predictivefailure Similarly the paper-bill spread failedto forecast the 2001 recession the paper-billspread had brief moderate spikes in October1998 October 1999 and June 2000 but thespread was small and declining from August2000 through the end of 2001 (Stock andWatson 2003b)

We know of little work examining the pre-dictive content of default spreads ineconomies other than the United StatesBernanke and Mishkin (1992) report a pre-liminary investigation but they questionedthe adequacy of their private debt interestrate data (the counterpart of the commercialpaper rate in the United States) for severalcountries Finding sufficiently long timeseries data on reliable market prices of suit-able private debt instruments has been abarrier to international comparisons on therole of the default spread

Some studies examined the predictivecontent of the default spread for inflationFriedman and Kuttner (1992) found littlepredictive content of the paper-bill spreadfor inflation using Granger causality testsConsistent with this Feldstein and Stock(1994) found that although the paper-billspread was a significant in-sample predictorof real GDP it did not significantly enterequations predicting nominal GDP

Four nonexclusive arguments have beenput forth on why the paperndashbill spread hadpredictive content for output growth duringthe 1960s and 1970s Stock and Watson(1989) suggested the predictive contentarises from expectations of default risk

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 10: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 797

which are in turn based on private expecta-tions of sales and profits Bernanke (1990)and Bernanke and Blinder (1992) arguedinstead that the paper-bill spread is a sensi-tive measure of monetary policy and this isthe main source of its predictive contentFriedman and Kuttner (1993ab) suggestedthat the spread is detecting influences ofsupply and demand (ie liquidity) in themarket for private debt this emphasis is sim-ilar to Cookrsquos (1981) attribution of move-ments in such spreads to supply and demandconsiderations Finally Thoma and Gray(1994) and Emery (1996) argued that thepredictive content is largely coincidental theconsequence of one-time events

There has been some examination of otherspreads in this literature Mark Gertler andLown (2000) take the view that because ofthe credit channel theory of monetary policytransmission the premise of using a defaultspread to predict future output is sound butthat the paper-bill spread is a flawed choicefor institutional reasons Instead they sug-gest using the high-yield bond (ldquojunkbondrdquo)ndashAaa spread instead The junk bondmarket was only developed in the 1980s inthe United States so this spread has a shorttime series Still Gertler and Lown (2000)present in-sample evidence that its explana-tory power was strong throughout this pe-riod This is notable because the paper-billspread (and as was noted above the termspread) have substantially reduced or no pre-dictive content for output growth in theUnited States during this period HoweverDucarsquos (1999) concerns about default spreadsin general extend to the junk bond-Aaaspread as well he suggests the spike in thejunk bond spread in the late 1980s and early1990s (which is key to this spreadrsquos signal ofthe 1990 recession) was a coincidental conse-quence of the aftermath of the thrift crisis inwhich thrifts were forced to sell their junkbond holdings in an illiquid market

Stock prices and dividend yields If theprice of a stock equals the expected dis-counted value of future earnings then stock

returns should be useful in forecasting earn-ings growth or more broadly output growthThe empirical link between stock prices andeconomic activity has been noted at leastsince Mitchell and Burns (1938) see StanleyFischer and Robert Merton (1984) andRobert Barro (1990) Upon closer inspec-tion however this link is murky Stockreturns generally do not have substantial in-sample predictive content for future outputeven in bivariate regressions with no laggeddependent variables (Fama 1981 and Harvey1989) and any predictive content is reducedby including lagged output growth Thisminimal marginal predictive content is foundboth in linear regressions predicting outputgrowth (Stock and Watson 1989 1999a) andin probit regressions of binary recessionevents (Estrella and Mishkin 1998)

In his review article Campbell (1999)shows that in a simple loglinear representa-tive agent model the log price-dividend ratioembodies rational discounted forecasts ofdividend growth rates and stock returnsmaking it an appropriate state variable to usefor forecasting But in his internationaldataset (fifteen countries sample periodsmainly 1970s to 1996 Campbell (1999)found that the log dividend price ratio has lit-tle predictive content for output This is con-sistent with the generally negative conclu-sions in the larger literature that examinesthe predictive content of stock returnsdirectly These generally negative findingsprovide a precise reprise of the witticism thatthe stock market has predicted nine of thelast four recessions

Campbell et al (2001) proposed an inter-esting variant in which the variance of stockreturns rather than the returns themselvescould have predictive content for outputgrowth Using in-sample statistics theyfound evidence that high volatility in onequarter signals low growth in the next quar-ter as it might if high volatility was associ-ated with increased doubts about short-termeconomic prospects When Hui Guo (2002)used out-of-sample statistics however the

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 11: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

798 Journal of Economic Literature Vol XLI (September 2003)

evidence for predictive content was substan-tially weaker These findings are consistentwith the predictive content of stock marketvolatility being stronger during someepisodes than during others

Few studies have examined the predictivecontent of stock prices for inflation One isCharles Goodhart and Boris Hofmann(2000a) who found that stock returns do nothave marginal predictive content for infla-tion in their international data set (twelvedeveloped economies quarterly data mainly1970ndash98 or shorter)

Other financial indicators Exchangerates are a channel through which inflationcan be imported in open economies In theUnited States exchange rates (or a measureof the terms of trade) have long entered con-ventional Phillips curves Robert Gordon(1982 1998) finds these exchange rates sta-tistically significant based on in-sample testsIn their international dataset howeverGoodhart and Hofmann (2000b) find thatpseudo out-of-sample forecasts of inflationusing exchange rates and lagged inflationoutperformed autoregressive forecasts inonly one or two of their seventeen countriesdepending on the horizon At least in theUS data there is also little evidence thatexchange rates predict output growth (egStock and Watson 1999a)

One problem with the nominal termstructure as a predictor of inflation is thatunder the expectations hypothesis the for-ward rate embodies forecasts of both infla-tion and future real rates In theory one caneliminate the expected future real rates byusing spreads between forward rates in theterm structures of nominal and real debt ofmatched maturity and matched bearer riskIn practice one of the few cases for whichthis is possible with time series of a reason-able length is for British index-linked bondsDavid Barr and Campbell (1997) investi-gated the (bivariate in-sample) predictivecontent of these implicit inflation expecta-tions and found that they had better predic-tive content for inflation than forward rates

obtained solely from the nominal term struc-ture They provided no evidence on Grangercausality or marginal predictive content ofthese implicit inflation expectations in multi-variate regressions

Lettau and Sydney Ludvigson (2001) pro-posed a novel indicator the log of the consumption-wealth ratio They argue thatin a representative consumer model with nostickiness in consumption the log ratio ofconsumption to total wealth (human andnonhuman) should predict the return on themarket portfolio They find empirically thattheir version of the consumptionndashwealthratio (a cointegrating residual between con-sumption of nondurables financial wealthand labor income all in logarithms) has pre-dictive content for multiyear stock returns(both real returns and excess returns) Ifconsumption is sticky it could also have pre-dictive content for consumption growthHowever Ludvigson and Charles Steindel(1999) and Lettau and Ludvigson (2000)find that this indicator does not predict con-sumption growth or income growth in theUnited States one quarter ahead

Housing constitutes a large component ofaggregate wealth and gets significant weightin the CPI in many countries More gener-ally housing is a volatile and cyclically sensi-tive sector and measures of real activity inthe housing sector are known to be usefulleading indicators of economic activity atleast in the United States (Stock and Watson1989 1999a) suggesting a broader channelby which housing prices might forecast realactivity inflation or both In the UnitedStates housing starts (a real quantity mea-sure) have some predictive content for infla-tion (Stock 1998 Stock and Watson 1999b)Studies of the predictive content of housingprices confront difficult data problems how-ever Goodhart and Hofmann (2000a) con-structed a housing price data set for twelveOECD countries (extended to seventeencountries in Goodhart and Hofmann(2000b) They found that residential housinginflation has significant in-sample marginal

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 12: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 799

predictive content for overall inflation in afew of the several countries they studyalthough in several countries they usedinterpolated annual data which makes thoseresults difficult to assess

Nonlinear models The foregoing discus-sion has focused on statistical models inwhich the forecasts are linear functions of thepredictors even the recession predictionmodels estimated using probit regressionsare essentially linear in the sense that the pre-dictive index (the argument of the probitfunction) is a linear function of the predic-tors Might the problems of instability andepisodically poor predictive content stemfrom inherent nonlinearities in the forecast-ing relation The evidence on this proposi-tion is limited and mixed Ted Jaditz LeighRiddick and Chera Sayers (1998) examinelinear and nonlinear models of US indus-trial production using asset price predictorsand conclude that combined nonlinear fore-casts improve upon simple linear modelsadditionally Greg Tkacz (2001) reportsimprovements of nonlinear models over lin-ear models for forecasting Canadian GDPOn the other hand John Galbraith and Tkacz(2000) find limited international evidence ofnonlinearity in the outputndashterm spread rela-tion Similarly in their pseudo out-of-samplecomparison of VAR to multivariate neuralnetwork forecasts Norman Swanson andHalbert White (1997) concluded that the lin-ear forecasts generally performed better forvarious measures of US economic activityand inflation Swanson and White (1995)reached similar conclusions when forecastinginterest rates Given this limited evidence wecannot rule out the possibility that the rightnonlinear model will produce stable and reli-able forecasts of output and inflation usinginterest rates but this ldquorightrdquo nonlinearmodel has yet to be found

32 Forecasts Using Nonfinancial Variables

The literature on forecasting output andinflation with nonfinancial variables is mas-

sive see Stock and Watson (1999a) for anextensive review of the US evidence Thissection highlights a few recent studies onthis topic

The use of nonfinancial variables to forecastinflation has to a large extent focused onidentifying suitable measures of output gapsthat is estimating generalized Phillips curvesIn the United States the unemployment-based Phillips curve with a constant non-accelerating inflation rate of unemployment(NAIRU) has recently been unstable predict-ing accelerating inflation during a time thatinflation was in fact low and steady or fallingThis instability has been widely documentedsee for example Gordon (1997 1998) andDouglas Staiger Stock and Watson (1997ab2001) One reaction to this instability has beento suggest that the NAIRU was falling in theUnited States during the 1990s Mechanicallythis keeps the unemployment-based Phillipscurve on track and it makes sense in the con-text of changes in the US labor market and inthe economy generally cf Lawrence Katzand Alan Krueger (1999) However an impre-cisely estimated time-varying NAIRU makesforecasting using the unemployment-basedPhillips curve problematic

A different reaction to this time variation inthe NAIRU has been to see if there are alter-native predictive relations that have beenmore stable Staiger Stock and Watson(1997a) considered 71 candidate leadingindicators of inflation both financial and non-financial (quarterly US) and in a similar butmore thorough exercise Stock and Watson(1999b) considered 167 candidate leadingindicators (monthly US) They found a fewindicators that have been stable predictors ofinflation the prime example being the capac-ity utilization rate Gordon (1998) and Stock(1998) confirmed the accuracy of recent USinflation forecasts based on the capacity uti-lization rate Stock and Watson (1999b) alsosuggested an alternative Phillips-curve-typeforecast based on a single aggregate activityindex computed using 85 individual measuresof real aggregate activity

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 13: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

800 Journal of Economic Literature Vol XLI (September 2003)

Recently Andrew Atkeson and LeeOhanian (2001) challenged the usefulness ofall inflation forecasts based on the Phillipscurve and its variants They showed that forthe United States from 1984 to 2001 pub-lished inflation forecasts and pseudo out-of-sample Phillips curve forecasts did not beata seasonal random walk forecast of annualinflation Their finding poses a significantchallenge to all attempts to forecast inflationand we return to it in our empirical analysis

The international evidence on the suitabil-ity of output gaps and the Phillips Curve forforecasting inflation is mixed Simple unem-ployment-based models with a constantNAIRU fail in Europe which is one way tostate the phenomenon of so-called hysteresisin the unemployment rate More sophisti-cated and flexible statistical tools for esti-mating the NAIRU can improve in-samplefits for the European data (Thomas Laubach2001) but their value for forecasting is ques-tionable because of imprecision in the esti-mated NAIRU at the end of the sampleSimilarly inflation forecasts based on an out-put gap rather than the unemployment rateface the practical problem of estimating thegap at the end of the sample which neces-sarily introduces a one-sided estimate andassociated imprecision Evidence inMassimiliano Marcellino Stock and Watson(2000) suggests that the ability of output gapmodels to forecast inflation in Europe ismore limited than in the United States

Finally there is evidence (from US data)that the inflation process itself as well as pre-dictive relations based on it is time-varyingWilliam Brainard and George Perry (2000)and Timothy Cogley and Thomas Sargent(2001 2002) suggested that the persistence ofUS inflation was high in the 1970s and 1980sbut subsequently declined although this con-clusion appears to be sensitive to the methodused to measure persistence (FredericPivetta and Ricardo Reis 2002) GeorgeAkerlof William Dickens and Perry (2000)provided a model based on near-rationalbehavior that motivates a nonlinear Phillips

curve which they interpreted as consistentwith the Brainard and Perry (2000) evidence

In a similar vein Stephen Cecchetti RitaChu and Charles Steindel (2000) performeda pseudo out-of-sample forecasting experi-ment on various candidate leading indicatorsof inflation from 1985 to 1998 in the UnitedStates including interest rates term anddefault spreads and several nonfinancialindicators They concluded that none ofthese indicators financial or nonfinancialreliably predicts inflation in bivariate fore-casting models and that there are very fewyears in which financial variables outperforma simple autoregression Because theyassessed performance on a year-by-yearbasis these findings have great samplingvariability and it is difficult to know howmuch of this is due to true instability Stilltheir findings are consistent with Stock andWatsonrsquos (1996) results based on formal sta-bility tests that time variation in thesereduced form bivariate predictive relationsis widespread in the US data

33 Discussion

An econometrician might quibble withsome aspects of this literature Many ofthese studies fail to include lagged endoge-nous variables and thus do not assess mar-ginal predictive content Results oftenchange when marginal predictive content isconsidered (the predictive content of theterm spread for inflation is an example)Many of the regressions involve overlappingreturns and when the overlap period is largerelative to the sample size the distribution ofin-sample t-statistics and R2s becomes non-standard Some regressors such as the divi-dend yield and the term spread are highlypersistent and even if they do not have aunit root this persistence causes conventionalinference methods to break down These latter two problems combined make it evenmore difficult to do reliable inference andvery few of these studies mention far lesstackle either of these difficulties with their

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 14: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 801

in-sample regressions Instability is a majorfocus of some of these papers yet formaltests of stability are the exception Finallyalthough some of the papers pay close atten-tion to simulated forecasting performancepredictive content usually is assessedthrough in-sample fits that require constantparameters (stationarity) for external validity

Despite these shortcomings the literaturedoes suggest four general conclusions Firstthe variables with the clearest theoreticaljustification for use as predictors often havescant empirical predictive content Theexpectations hypothesis of the term struc-ture of interest rates suggests that the termspread should forecast inflation but it gen-erally does not once lagged inflation isincluded Stock prices and log dividendyields should reflect expectations of futurereal earnings but empirically they providepoor forecasts of real economic activityDefault spreads have the potential to pro-vide useful forecasts of real activity and attimes they have but the obvious default riskchannel appears not to be the relevant chan-nel by which these spreads have their pre-dictive content Moreover the particulars offorecasting with these spreads seem to hingeon the current institutional environment

Second there is evidence that the termspread is a serious candidate as a predictor ofoutput growth and recessions The stabilityof this proposition in the United States isquestionable however and its universality isunresolved

Third although only a limited amount ofinternational evidence on the performanceof generalized Phillips curve models wasreviewed above generalized Phillips curvesand output gaps appear to be one of the fewways to forecast inflation that have been reli-able But this too seems to depend on thetime and country

Fourth our reading of this literature sug-gests that many of these forecasting relationsare ephemeral The work on using assetprices as forecasting tools over the past fifteenyears was in part a response to disappoint-

ment over the failure of monetary aggregatesto provide reliable and stable forecasts or tobe useful indicators of monetary policy Theevidence of the 1990s on the term spread thepaperndashbill spread and on some of the othertheoretically suggested asset price predictorsrecalls the difficulties that arose when mone-tary aggregates were used to predict the tur-bulent late 1970s and 1980s the literature onforecasting using asset prices apparently hasencountered the very pitfalls that its partici-pants hoped to escape One complaint aboutforecasts based on monetary aggregates wasthe inability to measure money properly inpractice The results reviewed here suggest amore profound set of problems after allthese asset prices are measured well and inmany cases the underlying financial instru-ment (a broad-based stock portfolio short-term government debt) does not vary appre-ciably over time or even across countriesThese observations point to a deeper problemthan measurement that the underlying rela-tions themselves depend on economic poli-cies macroeconomic shocks and specificinstitutions and thus evolve in ways that aresufficiently complex that real-time forecast-ing confronts considerable model uncertainty

4 Forecasting Models and Statistics

We now turn to an empirical analysis ofthe predictive content of asset prices andother leading indicators for output growthand inflation using quarterly data from 1959to 1999 (as available) for seven OECD coun-tries Our purpose is to provide a systematicreplication extension and reappraisal of thefindings in the literature reviewed in section3 Real output is measured by real GDP andby the index of industrial production (IP)Inflation is measured by the percentagechange of the consumer price index (CPI)or its counterpart and of the implicit GDPdeflator (PGDP) This section extends sec-tion 2 and discusses additional empiricalmethods The data are discussed in section 5and results are presented in sections 6 and 7

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 15: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

802 Journal of Economic Literature Vol XLI (September 2003)

41 Forecasting Models

The analysis uses h-step ahead linearregression models of the form (3) In addi-tion we examine the predictive content of Xtfor after controlling for past values ofYt and past values of a third predictor Zt forexample we examine whether the predictiveperformance of a backward-looking Phillipscurve is improved by adding an asset priceThis is done by augmenting (3) to includelags of Zt

5 b0 1 b1(L)Xt 1 b2(L)Yt

1 b3(L)Zt 1 (5)

The dependent variables are transformedto eliminate stochastic and deterministictrends The logarithm of output is alwaystreated as integrated of order one (I(1)) sothat Yt is the quarterly rate of growth of out-put at an annual rate Because there is ambi-guity about whether the logarithm of pricesis best modeled as being I(1) or I(2) theempirical analysis was conducted using bothtransformations The out-of-sample fore-casts proved to be more accurate for the I(2)transformation so to save space we presentonly those results Thus for the aggregateprice indices Yt is the first difference of thequarterly rate of inflation at an annual rateTransformations of the predictors are dis-cussed in the next section

Definitions of The multistep fore-casts examine the predictability of the loga-rithm of the level of the variable of interestafter imposing the I(1) or I(2) constraint Foroutput we consider cumulative growth at anannual percentage rate of output over the hperiods so 5 (400h)ln(Qt1hQt)where Qt denotes the level of the real outputseries For prices we consider the h-periodrate of inflation (400h)ln(Pt1hPt) where Ptis the price level upon imposing the I(2)constraint this yields the dependent variable

5 (400h)ln(Pt1hPt) 2 400ln(PtPt21)Lag lengths and estimation To make the

results comparable across series and country

Yht 1 h

Yht 1 h

Yht 1 h

uht 1h

Yht 1 h

Yht 1 h

for the in-sample analysis we use a fixed laglength of four (so that the regressors in (3) areXthellip Xt23 Ythellip Yt23) For the pseudo out-of-sample analysis the lag length is data-dependentmdashspecifically chosen using theAICmdashso that the model could adapt to poten-tially different dynamics across countries andover time For the univariate forecasts theAIC-determined lag length was restricted tobe between zero and four For the bivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags of Xt were considered For the trivariateforecasts between zero and four lags of Ytwere considered and between one and fourlags each of Xt and Zt were considered

42 Model Comparison Statistics

For each forecasting model we computedboth in-sample and pseudo out-of-samplestatistics

In-sample statistics The in-sample sta-tistics are the heteroskedasticity-robustGranger-causality test statistic computed ina 1-step ahead regression (h 5 1 in (3) and(5)) and the Richard Quandt (1960) likeli-hood ratio (QLR) test for coefficient stabil-ity computed over all possible break dates inthe central 70 percent of the sample

The QLR statistic tests the null hypothesisof constant regression coefficients againstthe alternative that the regression coef-ficients change over time Our version of the QLR statistic also known as the sup-Wald statistic entails computing the heteroskedasticity-robust Wald statistic test-ing for a break in the coefficients at a knowndate then taking the maximum of those sta-tistics over all possible break dates in thecentral 70 percent of the sample Althoughthis statistic is designed to detect a break ata single date it has good power against otherforms of parameter instability includingslowly drifting parameters (Stock andWatson 1998) The asymptotic null distribu-tion of this statistic was derived by DonaldAndrews (1993) (a corrected table of critical

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 16: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 803

values is provided in Stock and Watson2003a table 125)

Two versions of the QLR statistic werecomputed The first tests only for changes inthe constant term and the coefficients on Xtand its lags that is for a break in b0 andb1(L) in (3) and (5) under the maintainedhypothesis that the remaining coefficientsare constant The second tests for changes inall of the coefficients The qualitative resultswere the same for both statistics To savespace we report results for the first test only

Pseudo out-of-sample statistics The pseu-do out-of-sample statistics are based on fore-casts computed for each model horizon andseries being forecasted The model estima-tion and selection is recursive (uses all avail-able prior data) as the forecasting exerciseproceeds through time We computed thesample relative MSFE defined in (4) rela-tive to the AR benchmark where both mod-els have recursive AIC lag length selectionFor most series the out-of-sample forecast-ing exercise begins in the first quarter of1971 and continues through the end of thesample period For variables available from1959 onward this means that the first fore-cast is based on approximately ten years ofdata after accounting for differencing andinitial conditions For variables with laterstart dates the out of sample forecast periodbegins after accumulating ten years of dataThe out of sample period is divided into twosub-periods 1971ndash84 and 1985ndash99 Theseperiods are of equal length for the four-quarter ahead forecasts Because the modelsare nested tests of the hypothesis that thepopulation relative MSFE is one against thealternative that it is less than one are con-ducted using the asymptotic null distributionderived by Clark and McCracken (2001)

5 Data

We collected data on up to 26 series foreach country from 1959 to 1999 although forsome countries certain series were eitherunavailable or were available only for a

shorter period Data were obtained from fourmain sources the International MonetaryFund IFS database the OECD database theDRI Basic Economics Database and theDRI International Database Additionalseries including spreads real asset prices andex-ante real interest rates were constructedfrom these original 26 series bringing thetotal number of series to 43 These 43 seriesare listed in table 1 The dates over whicheach series is available are listed in the appen-dix The data were subject to five possibletransformations done in the following order

First a few of the series contained largeoutliers such as spikes associated withstrikes variable re-definitions etc (thoseseries and outlier dates are listed in theAppendix) Those outliers were replacedwith an interpolated value constructed as themedian of the values within three periods oneither side of the outlier

Second many of the data showed signifi-cant seasonal variation and these serieswere seasonally adjusted Seasonal variationwas determined by a pre-test (regressing anappropriately differenced version of theseries on a set of seasonal dummies) carriedout at the 10-percent level Seasonal adjust-ment was carried out using a linear approxi-mation to X11 (Kenneth Wallisrsquos 1974 formonthly series and Guy Larocquersquos 1977 forquarterly series) with endpoints calculatedusing autoregressive forecasts and backcasts

Third when the data were available on amonthly basis the data were aggregated toquarterly observations For the index of indus-trial production and the CPI (the variablesbeing forecast) quarterly aggregates wereformed as averages of the monthly values Forall other series the last monthly value of thequarter was used as the quarterly value

Fourth in some cases the data were trans-formed by taking logarithms

Fifth the highly persistent or trendingvariables were differenced second differ-enced or computed as a ldquogaprdquo that is adeviation from a stochastic trend Becausethe variables are being used for forecasting

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 17: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

TABLE 1SERIES DESCRIPTIONS

Series Label Sampling Frequency Description

Asset Prices

rovnght M Interest rate overnight rtbill M Interest rate short-term govrsquot bills rbnds M Interest rate short-term govrsquot bonds rbndm M Interest rate medium-term govrsquot bonds rbndl M Interest rate long-term govrsquot bonds rrovnght Q Real overnight rate rovnghtndashCPI inflation rrtbill Q Real short-term bill rate rtbillndashCPI inflation rrbnds Q Real short-term bond rate rtbndsndashCPI inflation rrbndm Q Real medium-term bond rate rtbndmndashCPI inflation rrbndl Q Real long-term bond rate rtbndlndashCPI inflation rspread M Term spread rbndlndashrovnght exrate M Nominal exchange rate rexrate M Real exchange rate (exrate 3 relative CPIs) stockp M Stock price index rstockp M Real stock price index stockp divpr Q Dividend price index house Q Housing price index rhouse Q Real housing price index gold M Gold price rgold M Real gold price silver M Silver price rsilver M Real silver price

Activity

rgdp Q Real GDP ip M Index of industrial production capu MampQ Capacity utilization rate emp MampQ Employment unemp MampQ Unemployment rate

Wages Goods and Commodity Prices

pgdp Q GDP deflator cpi M Consumer price index ppi M Producer price index earn M Wages commod M Commodity price index oil M Oil price roil M Real oil prices rcommod M Real commodity price index

Money

m0 M Money M0 or monetary base m1 M Money M1 m2 M Money M2 m3 M Money M3 rm0 M Real money M0 rm1 M Real money M1 rm2 M Real money M2 rm3 M Real money M3

Notes M indicates that the original data are monthly Q indicates that they are quarterly MampQ indicates thatmonthly data were available for some countries but quarterly data were available for others All forecasts andregressions use quarterly data which were aggregated from monthly data by averaging (for CPI and IP) or byusing the last monthly value (all other series) Additional details are given in the appendix

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 18: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 805

4 Because real interest rates are formed by subtractingCPI inflation for the current quarter and because inflationis modeled in second differences the CPI inflation fore-cast based on the regression (3) where Xt is the first dif-ference of nominal interest rates is the same as the CPIinflation forecast where Xt is the first difference of realinterest rates as long as the number of lags is the same inthe two regressions and the number of interest rate lags isnot more than the number of inflation lags But becausethese two lag lengths are selected recursively by BICthese conditions on the lags need not hold so the relativeMSFEs for CPI inflation forecasts based on nominal andreal interest rates can (and do) differ

5 Full empirical results are available at httpwwwwwsprincetonedu~mwatson

the gaps were computed in a way that pre-served the temporal ordering Specificallythe gaps here were estimated using a one-sided version of the filter proposed byRobert Hodrick and Edward Prescott(1981) details of which are given in theappendix

For some variables such as interest ratesit is unclear whether they should be in-cluded in levels or after first differencing sofor these variables we consider both ver-sions In all this results in a maximum 73candidate predictors per country for each ofthe inflation and output growth forecasts

6 Results for Models with IndividualIndicators

This section summarizes the empiricalresults for forecasts of inflation and outputgrowth using individual predictors Fore-casts were made for two- four- and eight-step ahead inflation and output growth (h 52 4 and 8 in (3) and (5)) Among the bivari-ate models (for which there is no Z variablein (5)) there are a total of 6123 potentialpairs of predictor and dependent variableover the three horizons and seven countriesof these we have at least some empiricalresults for 5080 cases4 To save space wefocus on four-quarter ahead forecasts of CPIinflation and GDP growth A full set ofresults for all horizons and dependent vari-ables is available in the results supplementwhich is available on the web5

61 Forecasts of Inflation

The performance of the various individualindicators relative to the autoregressivebenchmark is summarized in table 2 forfour-quarter ahead forecasts of CPI infla-tion The first row provides the root meansquared forecast error of the pseudo out-of-sample benchmark univariate autoregressiveforecasts in the two sample periods For thesubsequent rows each cell corresponds toan indicatorcountry pair where the twoentries are for the two sample periods Thesecond and third rows report the relativeMSFEs of the no-change (random walk)forecast and of the seasonal no-change fore-cast which is the Atkeson-Ohanian (2001)forecast at a quarterly sampling frequency

Inspection of table 2 reveals that somevariables forecast relatively well in somecountries in one or the other subsamplesFor example the inflation forecast based onthe nominal short rate has a relative MSFEof 068 in the first subsample in France indi-cating a 32-percent improvement in thisperiod relative to the benchmark autoregres-sion in Japan and the United Kingdom stockprices produce a relative MSFE of 86 and85 in the first period Real activity measuresare also useful in some countryvariablesub-sample cases for example the capacity uti-lization rate works well for the United Statesduring both subsamples and M2 predictedinflation well for Germany in the first period

These forecasting successes however areisolated and sporadic For example housingprice inflation predicts CPI inflation in thefirst period in the United States but it per-forms substantially worse than the AR bench-mark in the second period in the UnitedStates and in the other countries The shortrate works well in France in the first periodbut quite poorly in the second The rate ofincrease of the price of gold occasionally is auseful predictor Monetary aggregates rarelyimprove upon the AR model except for M2and real M2 in the first period for GermanyCommodity price inflation works well in the

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 19: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

TABLE 2PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS1971ndash84 AND 1985ndash99 CPI INFLATION 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 210 167 237 102 128 173 465 138 495 132 525 206 250 128Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)2pt 5 t 092 120 097 109 117 174 094 089 077 191 097 114 092 120(1-L4)2pt 5 t 117 076 104 106 097 085 099 103 090 092 091 101 119 079

Bivariate Forecasts MSFE Relative to Univariate Autoregressionrovnght level 112 068 147 101 103 276 101 203 098 099 107rtbill level 108 107 212 115 180 171 090 092 103rbnds level 186 090 099 103rbndm level 165 094 101 096rbndl level 124 099 126 100 082 111 137 102 534 108 100 106 098rovnght D 103 107 105 099 097 210 100 097 115 105 099rtbill D 103 099 100 102 112 107 092 113 098rbnds D 104 090 102 099rbndm D 116 153 102 118rbndl D 127 098 107 105 094 102 120 115 241 106 109 098 117rrovnght level 106 116 097 121 097 146 154 130 143 130 107rrtbill level 149 101 122 124 116 089 174 138 096rrbnds level 096 148 127 098rrbndm level 126 162 135 111rrbndl level 124 093 130 148 112 082 133 174 126 096 134 132 112rrovnght D 089 106 087 099 097 092 101 103 118 114 098rrtbill D 104 087 098 110 108 088 097 110 097rrbnds D 090 088 099 098rrbndm D 115 106 096 102rrbndl D 118 088 092 104 093 104 118 116 117 100 106 097 105rspread level 107 110 146 113 101 255 124 106 091 140exrate Dln 098 124 121 103 177 123 212rexrate Dln 093 132 108 092 188 106 212stockp Dln 099 112 118 101 102 101 135 107 086 264 085 121 095 120rstockp Dln 100 114 111 101 101 101 126 114 083 283 093 117 094 122divpr ln 154 196 120 105 433 201 109 122house Dln 116 660 097 086 111rhouse ln 126 453 202 091 111rhouse Dln 120 391 108 070 104gold Dln 102 095 106 091 119 099 114 095 202 093 090 092 143 103gold D2ln 130 101 100 099 105 100 095 101 101 099 103 100 102 110rgold ln 119 093 203 098 116 102 154 105 151 126 118 100 220 093rgold Dln 094 091 124 092 117 098 106 118 167 089 094 092 131 090silver Dln 106 109 106 108 111 096 113silver D2ln 105 103 106 098 117 108 117rsilver ln 111 165 111 293 247 145 139rsilver Dln 101 110 106 115 109 095 112rgdp Dln 100 089 100 094 099 090 089 103 177 104 091 082 084rgdp gap 099 084 130 082 094 092 113 107 084 106 088 085 094ip Dln 099 084 100 104 095 107 094 081 095 143 105 096 083 086ip gap 100 091 100 107 085 106 098 116 105 100 092 091 077 097capu level 103 070 221 103 196 255 074 080emp Dln 098 095 163 081 106 100 186 085 090 074 089

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 20: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

TABLE 2 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99emp gap 089 078 253 082 105 104 119 091 138 065 104unemp level 116 081 369 102 098 115 130 119 232 106 088 076 089unemp Dln 097 091 083 082 098 101 126 098 206 089 114 078 097unemp gap 092 076 114 083 096 108 113 113 117 088 088 075 102pgdp Dln 108 102 236 100 099 110 102 116 149 099 119 106 108pgdp D2ln 102 100 102 098 100 103 099 098 110 099 101 100 098ppi Dln 136 092 182 098 134 139 103 102 122 091ppi D2ln 100 091 106 098 075 129 090 099 105 093earn Dln 109 103 107 111 103 095 129 105 128 095 110 103earn D2ln 103 100 100 099 099 100 103 103 105 099 100 099oil Dln 116 093 204 101 123 100 091 162 240 147 099 108 109 099oil D2ln 122 096 149 099 129 098 092 160 097 098 105 100 103 089roil ln 157 095 114 071 110 099 178 096 144 177 111 166 281 086roil Dln 111 092 189 104 105 100 108 147 206 123 095 138 101 099comod Dln 112 091 120 102 105 099 103 097 136 198 102 084 079 126comod D2ln 100 101 113 134 102 100 099 148 105 206 105 100 099 164rcomod ln 123 089 128 112 121 114 108 138 113 226 096 160 079 144rcomod Dln 103 085 114 107 103 097 090 118 097 205 094 082 068 134m0 Dln 149 105 112m0 D2ln 281 100 105m1 Dln 128 103 123 106 108 095 096 131 139 095 120m1 D2ln 109 102 123 101 105 101 094 103 132 101 105m2 Dln 124 075 122 237 320 104 101m2 D2ln 130 099 105 161 178 102 103m3 Dln 124 099 097 108 101 090 317 103 102m3 D2ln 118 098 100 103 107 094 309 100 096rm0 Dln 271 080 139rm1 Dln 114 112 179 105 101 090 102 136 144 083 165rm2 Dln 123 065 122 232 273 097 095rm3 Dln 130 092 107 095 092 145 220 089 113

Notes The two entries in each cell are results for first and second out-of-sample forecast periods (1971ndash84 and1985ndash99) The first row shows the root mean square forecast error for the univariate autoregression All otherentries are the mean square forecast errors (MSFE) relative to the MSFE for the univariate autoregression Thesecond and third row present relative MSFEs for alternative univariate benchmarks the no-change forecast ofinflation (inflation follows a random walk) and the seasonal (four-lag) no-change forecast of inflation For theentries labeled Bivariate Forecasts the first column lists the indicator and the second column lists the transfor-mation used for the indicator Let St denote the original series and Xt denote the series used in the regression(3) The transformations are

level Xt 5 StD Xt 5 St 2 St21ln Xt 5 lnStDln Xt 5 lnSt 2 lnSt21D2ln Xt 5 (lnSt 2 lnSt21) 2 (lnSt21 2 lnSt22)

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 21: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

808 Journal of Economic Literature Vol XLI (September 2003)

United States in the first period but not inthe second in Canada it works well in thesecond period but not in the first and insome countryperiod combinations it worksmuch worse than the AR benchmark Thisinstability is also evident in the two otherunivariate forecasts the no-change and sea-sonal no-change For example the seasonalno-change forecast works well in the UnitedStates in the second period but poorly in thefirst a similar pattern as in Canada (but theopposite pattern as in the United Kingdom)

The only set of predictors that usuallyimproves upon the AR forecasts is the mea-sures of aggregate activity For example theIP and unemployment gaps both improveupon the AR (or are little worse than theAR) for both periods for Canada Germanyand the United States Even for these pre-dictors however the improvement is neitheruniversal nor always stable

One possible explanation for this apparentinstability is that it is a statistical artifact afterall these relative MSFEs have a samplingdistribution so there is sampling uncertainty(estimation error) associated with the esti-mates in table 2 To examine this possibilitywe used the results in Clark and McCracken(2001) to test the hypothesis that the relativeMSFE is one against the alternative that it isless than one The Clark-McCracken (2001)null distribution of the relative MSFE cannotbe computed for the statistics in table 2because the number of lags in the modelschanges over time so instead we computedit for pseudo out-of-sample forecasts frommodels with fixed lag lengths of four (com-plete results are available in the web resultssupplement) With fixed lag lengths the5-percent critical value ranges from 092 to096 (the null distribution depends on consis-tently estimable nuisance parameters so itwas computed by simulation on a series-by-series basis yielding series-specific criticalvalues) Most of the improvements over theAR model are statistically significant In thissense it appears that the observed temporalinstability of the MSFEs is not a conse-

quence of sampling variability alone for seriesthat in population have no predictive content

62 Forecasts of Output Growth

Table 3 which has the same format astable 2 summarizes the performance of theindividual forecasts of real GDP growth atthe four quarter horizon (results for the otherhorizons for GDP and for all horizons for IPare given in the results supplement describedin footnote 5) The results in table 3 are con-sistent with the literature surveyed in section3 The forecasts based on the term spread areof particular interest given their prominencein that literature In the United States theseforecasts improve upon the AR benchmarkin the first period but in the second periodthey are much worse than the AR forecasts(the relative MSFE is 048 in the first periodbut 251 in the second) This is consistentwith the literature reviewed in section 31which found a deterioration of the forecast-ing performance of the term spread as a pre-dictor of output growth in the United Statessince 1985 In Germany the term spread isuseful in the first period but not in the sec-ond (the relative MSFEs are 051 and 109)The cross-country evidence on usefulness isalso mixed the term spread beats the ARbenchmark in the second period in Canadaand Japan but not in France Germany Italythe United Kingdom or the United States

The picture for other asset prices is simi-lar The level of the short rate is a useful pre-dictor in Japan in the second period but notthe first and in Germany and the UnitedStates in the first period but not the secondThe nominal exchange rate outperforms theAR benchmark in the second period inCanada Germany Italy and Japan but notin France the United Kingdom or theUnited States Real stock returns improveupon the AR benchmark in the first periodbut not the second in Canada GermanyJapan and the United States

Predictors that are not asset prices fare nobetter or even worse Forecasts based on

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 22: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

TABLE 3PSEUDO OUT-OF-SAMPLE MEAN SQUARE FORECAST ERRORS

1971ndash84 AND 1985ndash99 REAL GDP GROWTH 4 QUARTERS AHEAD

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99Root Mean Square Forecast Error

Univ Autoregression 291 255 190 156 283 184 347 188 359 246 296 189 319 131Univariate Forecasts MSFE Relative to Univariate Autoregression

(1-L)yt = a + laquot 097 099 113 104 104 105 137 151 288 103 098 098 109Bivariate Forecasts MSFE Relative to Univariate Autoregression

rovnght level 074 157 030 148 148 121 089 113 078 142rtbill level 059 072 163 128 086 119 079 085 106rbnds level 092 087 092 129rbndm level 156 140 111 147rbndl level 080 094 159 046 212 116 155 088 100 096 118 166rovnght D 068 091 109 117 085 105 098 121 111 157rtbill D 105 103 098 137 048 124 111 132 163rbnds D 059 104 119 185rbndm D 111 142 101 217rbndl D 108 125 112 091 164 124 126 089 097 092 090 238rrovnght level 056 099 064 142 078 127 107 118 129 100rrtbill level 109 064 099 134 057 098 107 135 111rrbnds level 060 095 138 110rrbndm level 126 137 127 141rrbndl level 106 099 104 109 129 123 145 087 118 084 138 150rrovnght D 077 098 100 116 092 103 102 121 105 101rrtbill D 106 103 099 120 066 131 103 150 101rrbnds D 060 104 150 101rrbndm D 103 100 154 102rrbndl D 103 102 101 108 120 103 100 084 137 100 152 102rspread level 067 104 051 109 111 083 135 048 251exrate Dln 072 108 095 083 095 131 124rexrate Dln 072 108 115 077 094 127 124stockp Dln 096 102 099 092 150 111 104 098 101 098 100 090 127rstockp Dln 091 105 100 087 162 106 114 097 103 091 088 082 165divpr ln 083 115 144 154 141 110 101 147house Dln 084 119 134 106 093rhouse ln 069 100 176 147 133rhouse Dln 081 117 123 098 093gold Dln 127 096 112 128 104 109 105 156 105 121 124 139 101gold D2ln 109 101 100 104 100 101 101 109 103 110 100 110 101rgold ln 125 073 212 124 130 133 131 123 152 102 125 161 105rgold Dln 128 095 108 124 103 108 101 147 104 118 115 132 101silver Dln 079 135 091 077 097 133 100silver D2ln 082 100 090 072 102 108 097rsilver ln 115 288 166 124 096 202 135rsilver Dln 078 133 091 080 097 137 101ip Dln 097 089 104 096 093 098 111 099 102 115 100 100 100ip gap 103 099 120 104 100 097 101 109 095 108 108 112 107capu level 122 100 129 106 055 092 087 113emp Dln 117 095 097 098 082 102 101 120 099 100 100emp gap 110 104 120 109 136 100 101 110 104 152 106unemp level 128 096 149 147 097 117 097 109 084 152 093 114 106unemp Dln 108 103 108 092 112 109 105 105 101 139 107 097 105

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 23: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

810 Journal of Economic Literature Vol XLI (September 2003)

TABLE 3 (cont)

Transfor- Indicator mation Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99unemp gap 105 107 139 096 100 088 109 102 096 101 119 101 122 pgdp Dln 081 134 139 100 107 076 189 135 110 095 089 084 160pgdp D2ln 101 101 100 101 099 099 098 101 098 102 100 127 102cpi Dln 077 131 198 121 195 087 208 113 110 118 082 085 141cpi D2ln 103 101 104 100 103 105 101 102 103 123 104 129 122ppi Dln 092 131 043 215 151 122 103 094 090 178ppi D2ln 103 102 103 099 114 101 113 100 102 101earn Dln 091 124 113 114 098 123 097 086 095 094 204earn D2ln 102 102 100 099 095 098 103 098 103 104 101oil Dln 223 102 142 127 175 099 165 174 101 165 114 190 154oil D2ln 161 102 101 141 100 106 101 204 102 171 104 186 102roil ln 192 098 186 148 130 146 155 175 130 139 116 507 138roil Dln 242 101 138 155 151 100 153 164 101 189 115 379 137comod Dln 104 100 150 121 169 115 159 162 116 129 122 124 145comod D2ln 102 102 102 102 102 099 104 102 102 103 101 107 106rcomod ln 090 137 136 114 096 144 156 129 119 140 151 119 234rcomod Dln 102 100 138 127 135 113 103 164 109 123 099 132 101m0 Dln 105 105 102m0 D2ln 104 105 103m1 Dln 099 085 176 080 164 091 075 125 098 101 108m1 D2ln 102 097 085 104 101 094 089 080 095 103 118m2 Dln 087 102 090 068 067 098 133m2 D2ln 082 101 099 063 087 114 099m3 Dln 084 180 105 101 055 102 120 091m3 D2ln 082 104 090 095 091 109 115 100rm0 Dln 114 065 281rm1 Dln 065 113 074 058 196 069 093 068 114 062 351rm2 Dln 089 118 091 082 080 057 141rm3 Dln 089 167 123 058 068 091 079 106

Notes The second row presents the relative MSFE of a constant-change forecast of GDP (GDP follows a random walk with drift) See the notes to Table 2

money growth sometimes outperform the ARbenchmark but usually do not In some casesentire classes of predictors fail to improveupon the AR forecast For example oil pricesand commodity prices typically produce fore-casts much worse than the AR forecast andforecasts based on output gaps generally per-form slightly worse than the AR forecasts

As was the case for the inflation forecaststhe Clark-McCracken (2001) critical value forthe fixed-length models with four lags indi-cate that many of the improvements evidentin table 3 are statistically significant so the

apparent instability cannot be explained justby sampling (estimation) uncertainty of thesample relative MSFE

63 Forecast Stability

The foregoing discussion highlighted someexamples in which forecasts made using agiven asset price predictor in a certain coun-try did well in one period but poorly in an-other This could of course simply reflect theexamples we chose We therefore look moresystematically at the stability of forecastsmade using a given predictorcountryhorizon

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 24: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 811

TABLE 4 SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ASSET PRICE PREDICTORS 4 QUARTER HORIZON

A Inflation (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 006 029 036Out-of-Sample 1Period

Relative MSFE 018 046 064 1

Total 025 075 100

B Output (N = 211)

1971ndash84 Out-of-Sample Period

Relative MSFE Relative MSFE Total 1 1

1985ndash99 Relative MSFE 010 016 026Out-of-Sample 1Period

Relative MSFE 021 053 074 1

Total 031 069 100

Notes Each table shows the fraction of relative means square forecast errors (MSFE) less than 1 or greater than1 for each sample period relative to the univariate autoregressive benchmark Results shown are pooled for allpairs of asset price predictors and inflation measures (part A) or output measures (part B) for all countries athorizon h 5 4

combination as measured by the relativeMSFE in the two periods

Summary evidence on the stability of fore-casting relations based on asset prices isgiven in table 4 Table 4a presents a cross-tabulation of 211 four-quarter ahead fore-casts of inflation for all possible predictor-dependent variable pairs for the differentcountries (this table includes results for boththe GDP deflator and the CPI) Of the 211asset pricecountrydependent variable com-binations 6 percent have relative MSFEsless than one in both the first and secondperiod that is 6 percent outperform the ARbenchmark in both periods 18 percent out-perform the benchmark in the first but notthe second period 29 percent in the secondbut not the first and 46 percent are worse

than the benchmark in both periods Table4b presents analogous results for output (thetable includes results for both IP and realGDP growth)

The binary variables cross-tabulated intable 4 appear to be approximately inde-pendently distributed the joint probabilitiesare very nearly the product of the marginalprobabilities For example in panel A if therow and column variables were independentthen the probability of an indicatorcoun-tryhorizondependent variable combinationoutperforming the benchmark would be25 3 36 5 09 the empirically observedprobability is slightly less 06 In panel Bthe analogous predicted probability of out-performing the benchmark in both periodscomputed under independence is 31 3 26

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 25: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

812 Journal of Economic Literature Vol XLI (September 2003)

equals 08 the observed probability isslightly more 10 Because the draws are notindependent the chi-squared test for inde-pendence of the row and column variablesis inappropriate Still these calculationssuggest that whether an asset pricecoun-tryhorizondependent variable combinationoutperforms the benchmark in one period isnearly independent of whether it does so inthe other period

This lack of a relation between perfor-mance in the two subsamples is also evidentin figures 1a (inflation) and 1b (output)which are scatterplots of the logarithm of therelative MSFE in the first vs second periodsfor the 211 asset price-based forecasts ana-lyzed in table 4a and 4b respectively Anasset price forecast that outperforms the ARbenchmark in both periods appears as apoint in the southwest quadrant

One view of the universe of potential pre-dictors is that there are some reliable ones(perhaps the term spread or the short rate)while many others (perhaps gold and silverprices) have limited value and thus haveregression coefficients near zero If so thepoints in figure 1 would be scattered along aforty-five degree line in the southwest quad-rant with many points clustered near theorigin But this is not what figure 1 lookslike instead there are very few predictorsnear the forty-five degree line in the south-west quadrant and there are too manypoints far from the origin especially in thenorthwest and southeast quadrants If any-thing the view that emerges from figure 1 isthat performance in the two periods is near-ly unrelated or if it is related the correlationis negative asset prices that perform well inthe first period tend to perform poorly in thesecond period (the correlations in figures 1aand 1b are 222 and 221)

This pattern of instability is evidentwhether we aggregate or disaggregate theforecasts and is also present at shorter andlonger forecast horizons Figure 2 presentsthe comparable scatterplot of first v secondperiod log relative MSFEs for all 1080 avail-

able combinations of predictors (asset pricesand otherwise) countries and dependentvariables at the four-quarter horizon thepattern is similar to that in figure 1 alsoshowing a negative correlation (the correla-tion is 208) Similarly as seen in table 5this instability is evident when the forecastsare disaggregated by country As reported intable 5 the product of the marginal proba-bilities of beating the AR in the first periodtimes that for the second period very nearlyequals the joint probability of beating the ARin both periods for all countries for eitheroutput or inflation and for the two- andeight-quarter horizons as well as the four-quarter horizon A predictor that workedwell in the first periodmdashasset price or other-wisemdashis no more (and possibly less) likely tobeat the AR in the second period than a pre-dictor drawn at random from our pool

These findings of forecast instability arealso present in forecasts based on fixed laglengths (see the results supplement) so theinstability appears not to be an artifact ofrecursive BIC lag length selectionInterestingly in many cases the relativeRMSFEs of the BIC- and fixed-lag forecastsdiffer substantially by 10 or more Althoughthe overall conclusions based on tables 2ndash5and figures 1 and 2 are the same for fixed-and recursive-lag forecasts the conclusionsfor individual indicators sometimes differ bya surprising amount and this sensitivity to lagselection methods could be another indica-tion of instability in the forecasting relations

In short there appear to be no subsets ofcountries predictors horizons or variablesbeing forecast that are immune to this insta-bility Forecasting models that outperformthe AR in the first period may or may notoutperform the AR in the second butwhether they do appears to be random

64 In-Sample Tests for Predictive Contentand Instability

Because the literature surveyed in sec-tion 2 mainly uses in-sample statistics this

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 26: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 813

1b Output

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

215 210 205 00 05 10 151971ndash84

1a Inflation19

85ndash9

92

15

21

02

05

00

05

10

15

Figure 1 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Asset Price Predictors

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 27: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

814 Journal of Economic Literature Vol XLI (September 2003)

215 210 205 00 05 10 151971ndash84

Figure 2 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts All Predictors

1985

ndash99

21

52

10

20

50

00

51

01

5

section turns to in-sample measures of pre-dictive content and stability for these asset-price based forecasts The results of full-sample Granger causality tests for predictivecontent and QLR tests for instability suggestthree conclusions

First the Granger causality tests fre-quently reject indicating that a large frac-tion of these relations have substantial in-sample predictive content The Grangercausality test results are summarized in thefirst entry in each cell in table 6 For bothinflation and output forecasts the Grangercausality test rejects the null hypothesis ofno predictive content for 35 percent of theasset prices This evidence of predictive con-tent is not surprising after all these vari-ables were chosen in large part because theyhave been identified in the literature as use-ful predictors Inspection of the results for

each individual indicatorcountrydependentvariable combination (given in the resultssupplement) reveals individual Grangercausality results that are consistent withthose in the literature For example theterm spread is a statistically significant pre-dictor of GDP growth at the 1-percent levelin Canada France Germany and theUnited States but not (at the 10-percentlevel) in Italy Japan or the UnitedKingdom Exchange rates (real or nominal)are not significant predictors of GDP growthat the 5-percent level for any of the coun-tries but short-term interest rates are signif-icant for most of the countries The Grangercausality tests suggest that housing priceshave predictive content for IP growth atleast in some countries Real activity vari-ables (the IP gap the unemployment rateand capacity utilization) are significant in

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 28: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

TABLE 5SUMMARY OF PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY FOR TWO PERIODS

ALL PREDICTORS

Inflation Output

Country 1st 2nd 1amp2 1x2 N 1st 2nd 1amp2 1x2 N

Canada

2Q Ahead 036 034 010 012 80 020 054 013 011 80

4Q Ahead 046 033 010 015 80 031 066 024 021 80

8Q Ahead 034 038 014 013 80 031 047 020 015 80

France

2Q Ahead 021 024 003 005 29 021 028 007 006 29

4Q Ahead 034 031 014 011 29 028 038 014 010 29

8Q Ahead 031 041 014 013 29 014 021 000 003 29

Germany

2Q Ahead 045 028 014 013 86 035 035 015 012 86

4Q Ahead 047 024 012 011 86 036 031 014 011 86

8Q Ahead 045 027 015 012 86 040 040 017 016 86

Italy

2Q Ahead 024 044 018 010 72 028 033 011 009 72

4Q Ahead 035 032 021 011 72 032 029 011 009 72

8Q Ahead 035 039 017 014 72 028 038 017 010 72

Japan

2Q Ahead 037 024 009 009 70 019 030 001 006 70

4Q Ahead 017 030 009 005 70 023 016 004 004 70

8Q Ahead 031 027 017 009 70 019 013 003 002 70

United Kingdom

2Q Ahead 014 043 007 006 72 024 035 006 008 72

4Q Ahead 024 035 014 008 72 043 039 014 017 72

8Q Ahead 036 036 017 013 72 056 029 017 016 72

United States

2Q Ahead 030 020 002 006 132 037 044 020 016 132

4Q Ahead 038 017 005 007 132 033 035 010 012 132

8Q Ahead 039 018 007 007 132 052 033 017 017 132

All

2Q Ahead 031 030 009 009 541 028 039 012 011 541

4Q Ahead 035 027 011 010 541 033 036 013 012 541

8Q Ahead 037 030 013 011 541 038 033 015 012 541

Notes The four numbers in each cell show the fraction of relative MSFEs less than 1 in the first out-of-sampleperiod (column label 1st) in the second out-of-sample period (column label 2nd) in both the first and secondperiods (column label 1amp2) and the product of the first and the second (column label 1 3 2) Results are pooledfor all predictors the inflation results are the pooled results for the GDP deflator and CPI inflation the outputresults are the pooled results for IP growth and real GDP growth

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 29: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

816 Journal of Economic Literature Vol XLI (September 2003)

TABLE 6SUMMARY OF GRANGER CAUSALITY AND QLR TEST STATISTICS

A Summarized by Predictor Category

Inflation Output

Predictor Category GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Asset Prices 035 078 029 028 420 035 071 028 025 420

Activity 063 078 048 049 106 056 046 028 026 134

GampC Prices 032 079 025 026 216 056 077 047 043 188

Money 052 068 038 035 114 040 053 030 021 114

All 040 077 032 031 856 044 066 032 029 856

B Summarized by Country

Inflation Output

Country GC QLR GampQ G3Q N GC QLR GampQ G3Q N

Canada 052 074 040 038 124 050 068 033 034 124

France 041 084 035 034 108 052 068 038 035 108

Germany 043 070 032 030 118 035 067 025 023 118

Italy 033 083 025 027 126 035 062 025 022 126

Japan 034 085 029 029 122 041 066 031 027 122

United Kingdom 033 054 021 018 112 036 058 024 021 112

United States 04 082 038 037 146 056 073 047 041 146

All 040 077 032 031 856 044 066 032 029 856

Notes The Granger causality and QLR test statistics are heteroskedasticity-robust and were computed for a one-quarter ahead bivariate in-sample (full-sample) regression (h 5 1 in equation (3)) The five numbers in each cellare the fraction of bivariate models with significant (5) GC statistics (column label GC) significant (5) QLRstatistics (column label QLR) significant GC and QLR statistics (column label GampQ) the product of the firstand second (column label G 3 Q) and the number of models in each cell The models making up each cell arethe pooled results using CPI the GDP deflator real GDP and IP

most of the inflation equations Over all cat-egories of predictors 40 percent rejectGranger noncausality for inflation and 44 percent reject Granger noncausality foroutput growth The term spread has limitedpredictive content for inflation based on the Granger causality statistic it enters the GDP inflation equation significantly (at the 5-percent level) for only France and Italyand it enters the CPI inflation equation sig-nifi-cantly for only France Italy and theUnited States This finding is consistent withBernanke and Mishkin (1992) Kozicki(1997) and Estrella and Mishkin (1997) and

suggests that the predictive content of theterm spread for inflation found elsewhere inthe literature is a consequence of omittinglagged values of inflation

Second the QLR statistic detects wide-spread instability in these relations Theresults in table 6 indicate that among fore-casting equations involving asset prices theQLR statistic rejects the null hypothesis ofstability (at the 5-percent level) in 78 per-cent of the inflation forecasting relations andin 71 percent of the output forecasting rela-tions This further suggests that the instabil-ity revealed by the analysis of the relative

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 30: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 817

MSFEs in the two subsamples is not a statis-tical artifact but rather is a consequence ofunstable population relations

Third a statistically significant Grangercausality statistic conveys little if any informa-tion about whether the forecasting relation isstable This can be seen in several ways Forexample the frequent rejection of Grangernoncausality contrasts with the findings ofsection 63 table 5 (last panel) reports thatonly 9 percent of all predictors improve uponthe AR benchmark forecast of inflation twoquarters ahead in both the first and secondperiod yet table 6 reports that Granger non-causality is rejected for 40 percent of all pre-dictors Similarly only 12 percent of the pre-dictors of output growth improve upon theAR benchmark in both periods at two quar-ters ahead but Granger noncausality isrejected in 44 percent of the relations Figure3 is a scatterplot of the log relative MSFErestricted to predictors (asset prices and oth-erwise) and dependent variables (inflation

and output) for which the Granger causalitytest rejects at the 5-percent significance levelIf relations that show in-sample predictivecontent were stable then the points would liealong the 45deg line in the southwest quadrantbut they do not A significant Granger causal-ity statistic makes it no more likely that a pre-dictor outperforms the AR in both periods

Related evidence is reported in the thirdand fourth entries of each cell of table 6which respectively contain the product ofthe marginal rejection probabilities of theGranger causality and QLR tests and thejoint probability of both rejecting The jointprobability is in every case very close to theproduct of the marginal probabilities rejec-tion of Granger noncausality appears to beapproximately unrelated to whether or notthe QLR statistic rejects These findingshold with some variation for all the predic-tor categorycountrydependent variablecombinations examined in table 6 The QLRstatistics suggest a greater amount of insta-

215 210 205 00 05 10 151971ndash84

1985

ndash99

21

52

10

20

50

00

51

01

5

Figure 3 Scatterplot of Pseudo Out-of-Sample Log Relative Mean Squared Forecast Errors 4-Quarter Ahead Forecasts Predictors with Significant Granger Causality Statistics

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 31: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

818 Journal of Economic Literature Vol XLI (September 2003)

21 0 1 2 3 4 5Log GC Statistic

Figure 4 Scatterplot of Granger Causality and QLR Statistics All Predictors

Log

QL

R S

tatis

tic1

23

45

bility in the inflation forecasts than in theoutput forecasts Across countries inflationforecasts for Japan are most frequentlyunstable and output forecasts for the UnitedKingdom are the least frequently unstableAmong predictor categorydependent vari-able pairs the greatest instability is amonggoods and commodity prices as predictors ofinflation and the least is among activity vari-ables as predictors of output In all caseshowever the QLR and Granger causalitystatistics appear to be approximately inde-pendently distributed These findings arerecapitulated graphically in figure 4 a scat-terplot of the log of the QLR statistic vs thelog of the Granger causality F-statistic (allcases) that evinces little relation between thetwo statistics

65 Monte Carlo Check of SamplingDistribution of Relative MSFEs

As an additional check that the instabilityin the relative MSFEs is not just a conse-

quence of sampling variability we undertooka Monte Carlo analysis designed to match anempirically plausible null model with stablebut heterogeneous predictive relationsSpecifically for each of the 788 indicatorcountrydependent variable combinationsfor which we have complete data from 1959to 1999 the full available data set was usedto estimate the VAR Wt 5 m 1 A(L)Wt211 Vt where Wt 5 (Yt Xt) where Yt is the variable to be forecast and Xt is the can-didate predictor and Vt is an error vectorFor the ith such pair this produced esti-mates of the VAR parameters ui where u 5 (m A(L) Sv) for i 5 1hellip 788 Thiscollection of VAR coefficients constitutesthe distribution of models used for theMonte Carlo experiment

With this empirical distribution in handthe artificial data were drawn as follows

1 VAR parameters u were drawn from theirjoint empirical distribution

2 Artificial data on Wt 5 (Yt Xt) were

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 32: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 819

TABLE 7DIFFERENCES IN FIRST AND SECOND PERIOD RELATIVE MSFE

FOUR-QUARTER AHEAD FORECASTS

Median 75ndash25 Range 90ndash10 Range

Data 001 034 100

Simulations 002 012 027

Notes The entries summarize the distribution of the difference between first and second period relative MSFEsfor 4-quarter ahead forecasts The first row summarizes results for the 788 countryvariable pairs for which com-plete data from 1959ndash99 are available The second row summarizes results from 5000 simulated countryvariablepairs using a Monte Carlo design described in the text

generated according to a bivariate VARwith these parameters and Gaussianerrors with the number of observationsmatching the full sample used in theempirical analysis

3 Benchmark and bivariate forecasts of Ytwere made using the recursive multistepahead forecasting method outlined insections 2 and 4

4 Relative MSFEs for the two periods (sim-ulated 1971ndash84 and 1985ndash99) were com-puted as described in section 4 and thechange between the relative MSFEs inthe two periods was computed

In this design the distributions of thechange in the relative MSFEs incorporatesboth the sampling variability of these statis-tics conditional on the VAR parameters andthe (empirical) distribution of the estimatedVAR parameters

The results are summarized in table 7The main finding is that the distribution ofthe change in the relative MSFEs is muchtighter in the Monte Carlo simulation thanin the actual datamdashapproximately threetimes tighter at the quartiles and four timestighter at the outer deciles We concludethat sampling variation is insufficient toexplain the dramatic shifts in predictive con-tent observed in the data even afteraccounting for heterogeneity in the predic-tive relations Said differently if the predic-tive relations were stable it is quite unlikely

that we would have observed as many casesas we actually did with small relative MSFEsin one period and large relative MSFEs inthe other period

66 Trivariate Models

In addition to the bivariate models weconsidered forecasts based on trivariatemodels of the form (5) The trivariate mod-els for inflation included lags of inflation theIP gap and the candidate predictor Thetrivariate models for output growth includedlags of output growth the term spread andthe candidate predictor

Relative MSFEs are given for all indica-torscountriesdependent variableshorizonsin the results supplement The main conclu-sions drawn from the bivariate models alsohold for the trivariate models In somecountries and some time periods some indi-cators perform better than the bivariatemodel For example in Canada it wouldhave been desirable to use the unemploy-ment rate in addition to the IP gap for fore-casting CPI inflation in the second period(but not the first) in Germany it would havebeen desirable to use M2 growth in additionto the IP gap in the first period (but not thesecond)

There are however no clear systematicpatterns of improvement when candidateindicators are added to the bivariate modelRather the main pattern is that the trivariate

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 33: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

820 Journal of Economic Literature Vol XLI (September 2003)

relative MSFEs show subsample instabilitysimilar to those of the bivariate relativeMSFEs This instability is presumably inpart driven by the instability of the bivariaterelation that the trivariate relation extendsFor example all the trivariate models of out-put growth perform poorly in the UnitedStates in the second period reflecting thepoor performance of the term spread overthis period

7 Results for Combination Forecasts

This section examines the possibility thatcombining the forecasts based on the indi-vidual indicators can improve their perfor-mance The standard logic of combinationforecasts is that by pooling forecasts basedon different data the combined forecastuses more information and thus should bemore efficient than any individual forecastEmpirical research on combination forecastshas established that simple combinationssuch as the average or median of a panel offorecasts frequently outperform the con-stituent individual forecasts see the reviewin R T Clemen (1989) and the introductionsto combination forecasts in Diebold (1998)and Paul Newbold and David Harvey(2002) The theory of optimal linear combi-nation forecasts (J M Bates and Granger1969 Granger and Ramu Ramanathan 1984)suggests that combination forecasts shouldbe weighted averages of the individual fore-casts where the optimal weights correspondto the population regression coefficients in aregression of the true future value on thevarious forecasts One of the intriguingempirical findings in the literature on com-bination forecasts however is that theoreti-cally ldquooptimalrdquo combination forecasts oftendo not perform as well as simple means ormedians

The combination forecasts consideredhere are the trimmed mean of a set of fore-casts where the lowest and highest forecastswere trimmed to mitigate the influence of

occasional outliers results for combinationforecasts based on the median are similarand are given in the Results SupplementThe relative MSFEs of these combinationforecasts are summarized in table 8 and theresults are striking First consider the resultsfor asset-price based forecasts of inflation(the first three rows of parts A and B) Thetrimmed mean of all the individual forecastsof CPI inflation outperforms the benchmarkAR in most of the countryhorizonperiodcombinations and the relative RMSFEnever exceeds 109 When all forecasts (allpredictor groups) are combined the combi-nation CPI inflation forecast improves uponthe AR forecast in 41 of 42 cases and in theremaining case the relative RMSFE is 100The overall combination GDP inflation fore-casts improve upon the benchmark AR in 35of the 39 countryhorizon cases and its worstRMSFE is 104

Inspection of the results for differentgroups of indicators reveals that theseimprovements are realized across the boardFor Canada Germany the United Kingdomand the United States the greatest improve-ments tend to obtain using the combinationinflation forecasts based solely on the activityindicators while for France Italy and Japanthe gains are typically greatest if all availableforecasts are used In several cases the com-bination forecasts have relative MSFEs under080 so that these forecasts provide substan-tial improvements over the AR benchmark

The results for combination forecasts ofoutput growth are given in parts C and D oftable 8 The combination forecasts usuallyimprove upon the AR benchmark some-times by a substantial amount Interestinglythe combination forecasts based only on assetprices often have a smaller MSFE than thosebased on all predictors It seems that forforecasting output growth adding predictorsbeyond asset prices does not reliably improveupon the combination forecasts based onasset prices Even though the individual fore-casts based on asset prices are unstable com-

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 34: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 821

TABLE 8PSEUDO OUT-OF-SAMPLE FORECAST ACCURACY OF COMBINATION FORECASTS

A Consumer Price Index

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 093 099 093 089 101 102 109 090 109 090 099 090 086 093

4Q Horizon 085 096 092 088 101 101 102 089 106 091 084 086 089 094

8Q Horizon 080 094 088 093 098 100 094 095 091 081 089 074 084 085

Activity

2Q Horizon 097 091 100 091 100 094 091 099 094 097 081 075 089

4Q Horizon 096 077 095 083 097 093 090 099 107 086 074 072 082

8Q Horizon 090 059 131 068 088 104 093 107 097 069 067 072 069

All

2Q Horizon 093 095 094 090 097 100 099 078 092 082 096 085 083 089

4Q Horizon 090 089 093 086 092 099 093 080 091 078 084 078 084 088

8Q Horizon 087 083 093 088 088 094 093 085 091 065 081 066 081 078

B GDP Deflator

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 104 102 093 097 097 100 098 102 107 101 104 110 094

4Q Horizon 098 097 089 098 101 100 092 098 109 107 096 097 090

8Q Horizon 088 092 081 096 098 083 091 090 102 097 084 083 086

Activity

2Q Horizon 092 092 108 082 090 089 103 106 094 093 105 094 095

4Q Horizon 092 086 121 076 086 101 094 105 077 095 094 088 089

8Q Horizon 101 075 122 080 078 109 094 130 078 067 100 076 083

All

2Q Horizon 100 096 092 088 095 095 096 099 099 095 099 103 094

4Q Horizon 094 091 083 080 095 094 089 091 093 100 089 094 087

8Q Horizon 104 086 081 071 092 082 089 095 085 086 069 082 077

bined they perform well across the differenthorizons and countries Notably in theUnited States the relative mean squaredforecast error for 8-quarter ahead forecastsof industrial production growth based on thecombined asset price forecast is 055 in thefirst period and 090 in the second period

Results for combination forecasts basedon the trivariate models are presented inthe results supplement The trivariateforecasts typically improve upon the bench-mark AR forecasts however the improve-ments are not as reliable nor are they aslarge as for the bivariate forecasts We

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 35: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

822 Journal of Economic Literature Vol XLI (September 2003)

TABLE 8 (cont)

C Real GDP

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 092 075 098 078 098 093 092 092 094 104 095 084 096

4Q Horizon 089 076 101 071 104 083 083 104 090 083 098 076 101

8Q Horizon 082 075 096 082 097 078 066 119 094 079 106 063 100

All

2Q Horizon 096 083 099 086 097 095 087 091 094 101 093 086 097

4Q Horizon 094 087 104 080 102 084 082 101 090 087 097 082 101

8Q Horizon 090 087 098 090 098 078 070 103 091 089 104 076 099

D Industrial Production

Canada France Germany Italy Japan UK US

71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99 71ndash84 85ndash99

Asset Prices

2Q Horizon 088 093 080 093 079 089 091 094 090 096 098 092 087 093

4Q Horizon 083 084 081 089 072 090 089 083 094 091 085 091 066 095

8Q Horizon 080 075 085 086 082 082 082 078 107 079 076 100 055 090

All

2Q Horizon 093 094 083 094 080 094 091 092 086 094 096 090 086 093

4Q Horizon 087 094 081 094 078 098 088 083 087 089 083 091 074 096

8Q Horizon 086 087 085 088 088 090 084 079 094 082 081 099 070 093

Notes Entries are the relative mean square forecast errors for the combined forecasts constructed from the vari-ables in the categories listed in the first column relative to the AR benchmark

interpret this as arising because the trivari-ate models all have a predictor in common(the IP gap for inflation the term spreadfor output) This induces common instabili-ties across the trivariate models which inturn reduces the apparent ability of thecombination forecast to ldquoaverage outrdquo theidiosyncratic instability in the individualforecasts

8 Discussion and Conclusions

This review of the literature and our empir-ical analysis lead us to four main conclusions

1 Some asset prices have been useful pre-dictors of inflation andor output growth insome countries in some time periods Forexample the term spread was a useful pre-dictor of output growth in the United Statesand Germany prior to the mid-1980s Theempirical analysis in section 6 like those inthe literature occasionally found large fore-cast improvements using an asset price Thissaid no single asset price is a reliable predic-tor of output growth across countries overmultiple decades The term spread perhapscomes closest to achieving this goal but itsgood performance in some periods and coun-

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 36: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 823

tries is offset by poor performance in otherperiods andor countries As for inflationforecasts after controlling for lagged infla-tion individual asset prices provide improve-ments that are sometimes modest but rarelylarge relative to the AR benchmark For theUnited States in the first period the termspread helped to predict inflation in ourpseudo out-of-sample forecasting exercisebut this was not the case in other countries orin the second period in the United StatesStill even if the improvements are small inmany cases our study (like previous ones)finds these improvements to be statisticallysignificant Said differently when asset pricesimprove forecasts of inflation or outputgrowth these improvements often appear tobe real in the sense that they are unlikely tohave arisen just from the sampling variabilityof the relative MSFE under the null hypoth-esis that the population regression coeffi-cients on the asset price are zero

2 There is considerable instability inbivariate and trivariate predictive relationsinvolving asset prices and other predictorsIn our pseudo out-of-sample forecast com-parison we found that whether a predictorforecasts better than an autoregression inthe first out-of-sample period is essentiallyunrelated to whether it will do so in the sec-ond period This finding of instability in pre-dictive relations is confirmed by widespreadrejections of the null hypothesis of constantcoefficients by the (in-sample) QLR statistic

This empirical finding of instability is con-sistent with our reading of the literature onasset prices as predictors of output and infla-tion in which an initial series of papers iden-tifies what appears to be a potent predictiverelation that is subsequently found to breakdown in the same country not to be presentin other countries or both On the one handthis finding of instability is surprising for thelogic behind using asset prices for forecast-ing includes some cornerstone ideas ofmacroeconomics the Fisher hypothesis theidea that stock prices reflect expected futureearnings and the notion that temporarily

high interest rates lead to an economic slow-down On the other hand it makes sensethat the predictive power of asset pricescould depend on the nature of the shockshitting the economy and the degree of devel-opment of financial institutions which differacross countries and over time Indeed sev-eral of the papers reviewed in section 3underscore the situational dependence ofthe predictive content of asset prices Cook(1981) and Duca (1999) provide detailedinstitutional interpretations of the predictivepower of specific asset prices and Smets andTsatsaronis (1997) (among others) empha-size that different combinations of shocksand policies can lead to different degrees ofpredictive performance for asset pricesThese considerations suggest that assetprices that forecast well in one country or inone period might not do so in another Ofcourse this interpretation of these results isnot very useful if these indicators are to beused prospectively for forecasting accordingto this argument one must know the natureof future macroeconomic shocks and institu-tional developments that would make a par-ticular candidate indicator stand out It isone thing to understand ex post why a par-ticular predictive relation broke down it isquite another to know whether it will ex ante

Looked at more broadly the instabilitypresent in forecasts based on asset prices isconsistent with other evidence of instability inthe economy The US productivity slow-down of the mid-1970s and its revival in thelate 1990s represent structural shifts Recentresearch on monetary policy regimes pro-vides formal empirical support for the con-ventional wisdom that there have been sub-stantial changes in the way the FederalReserve Bank has conducted monetary policyover the past forty years (Bernanke and IlianMihov 1998 Cogley and Sargent 2001 2002Richard Clarida Jordi Galiacute and Mark Gertler2000 Sims and Tao Zha 2002) Europe hasseen sweeping institutional changes in mone-tary policy and trade integration over the pastforty years Other research on forecasting

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 37: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

824 Journal of Economic Literature Vol XLI (September 2003)

using low-dimensional models emphasizesthe widespread nature of parameter instabil-ity (Stock and Watson 1996 MichaelClements and David Hendry 1999) In addi-tion there has been a reduction in the volatil-ity of many macroeconomic variables includ-ing output growth and inflation in the UnitedStates (and it appears other countries) thathas persisted since the mid-1980s (C J Kimand Charles Nelson 1999 and MargaretMcConnell and Gabriel Perez-Quiros 2000for recent reviews see Olivier Blanchard andJohn Simon 2001 and Stock and Watson2002) Work on this moderation in volatilityhas identified a number of possible explana-tions changes in methods of inventory man-agement shifting sectoral composition of theeconomy changes in the financial sector thatreduce the cyclical sensitivity of production ofdurables and residential housing changes inthe size and nature of the shocks to the econ-omy and changes in monetary policy Forexample a successful shift of monetary policyto an inflation targeting regime in whichfuture deviations from the target were unex-pected would have the effect of making pre-viously potent predictive relations no longeruseful although such a shift generally wouldnot eliminate the predictability of outputfluctuations In principal any of these shiftscould result in changes to the reduced-formforecasting relations examined in this article

These observations suggest a number ofimportant directions for future research In apractical sense the previous paragraph pro-vides too many potential explanations forthese shifting relations and it seems impor-tant to obtain a better economic understand-ing of the nature and sources of thesechanges on a case-by-case basis From theperspective of forecasting methods this evi-dence of sporadic predictive content posesthe challenge of developing methods thatprovide reliable forecasts in the face of time-varying relations Conventional time-varyingparameter models and forecasts based ontrun-cated samples or windows are a naturalapproach but these methods do not seem to

lead to useful forecasts at least for low-dimensional models fit to US postwar data(Stock and Watson 1996 1999c) these mod-els reduce bias at the expense of increasingvariance so MSFEs frequently increaseOther possibilities include intercept correc-tions and overdifferencing (Clements andHendry 1999) Alternatively the evidencepresented here does not rule out the possi-bility that some fixed-parameter nonlinearmodels might produce stable forecasts andthat linear models simply represent state-dependent local approximations empiricallyhowever nonlinear models can produce evenwilder pseudo out-of-sample forecasts thanlinear models (Stock and Watson 1999c) Inany event the challenge of producing reli-able forecasts using low-dimensional modelsbased on asset prices remains open

3 In-sample Granger causality tests pro-vide a poor guide to forecast performanceThe distribution of relative MSFEs in thetwo periods for the subset of predictive rela-tions with a statistically significant Grangercausality statistic is similar to the distributionof relative MSFEs for all the predictive rela-tions in this sense rejection of Grangernoncausality does not provide useful infor-mation about the predictive value of theforecasting relation Similarly the Grangercausality statistic is essentially uncorrelatedwith the QLR statistic so the Grangercausality statistic provides no informationabout whether the predictive relation is sta-ble In short we find that rejection ofGranger noncausality is to a first approxima-tion uninformative about whether the rela-tion will be useful for forecasting

The conclusion that testing for Grangernoncausality is uninformative for assessingpredictive content is not as counterintuitiveas it initially might seem One model consis-tent with this finding is that the relation hasnon-zero coefficients at some point but thosecoefficients change at an unknown dateClark and McCracken (2002) provide theo-retical and Monte Carlo evidence that thesingle-break model is capable of generating

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 38: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 825

results like ours Their theoretical resultscombined with our empirical findings sug-gest that future investigations into predictivecontent need to use statistics such as breaktests and pseudo out-of-sample forecastingtests that can detect instability in predictiverelations Additional theoretical work onwhich set of tests has best size and power isin order

4 Simple combination forecasts reliablyimprove upon the AR benchmark and fore-casts based on individual predictorsForecasts of output growth constructed asthe median or trimmed mean of the fore-casts made using individual asset prices reg-ularly exhibit smaller pseudo out-of-sampleMSFEs than the autoregressive benchmarkand typically perform nearly as well as orbetter than the combination forecast basedon all predictors In this sense asset pricestaken together have predictive content foroutput growth Moreover the combinationforecasts are stable even though the individ-ual predictive relations are unstable Thevalue of asset prices for forecasting inflationis less clear although combination forecastsof inflation using real activity indicatorsimprove upon the autoregressive bench-mark further forecasting gains from incor-porating asset prices are not universal butinstead arise only for selected countries andtime periods In most cases the combinationforecasts improve upon the Atkeson-Ohanian (2001) seasonal random walk fore-cast sometimes by a substantial margin

The finding that averaging individuallyunreliable forecasts produces a reliable com-bination forecast is not readily explained bythe standard theory of forecast combinationwhich relies on information pooling in a sta-tionary environment Rather it appears thatthe instability is sufficiently idiosyncraticacross series for the median forecast toldquoaverage outrdquo the instability across the indi-vidual forecasting relations Fully articulatedstatistical or economic models consistentwith this observation could help to producecombination forecasts with even lower

MSFEs Developing such models remains atask for future research

AppendixThe sample dates and sampling frequencies are list-

ed for each variable by country in table 9 The originalsources for the series are given in the web results sup-plement

Some of the variables exhibited large outliers due tostrikes redefinitions etc As discussed in the text theseobservations were replaced by the median of the threeobservations on either side of the observation(s) inquestion The observations with interpolated valuesare France IP March 1963 and MayndashJune 1968United Kingdom PPI January 1974 Germany M3July 1990 France M3 December 1969 and January1978 Germany unemployment rate January 1978January 1984 and January 1992 Germany M1January 1991 Germany real and nominal GDP19911 Italy real and nominal GDP 19701 and Japanreal GDP 19791

The gap variables were constructed as the deviationof the series from a one-sided version of the Hodrick-Prescott (1981) (HP) filter The one-sided HP gap esti-mate is constructed as the Kalman filter estimate of laquotfrom the model yt 5 tt 1 laquot and D2tt 5 ht where yt isthe observed series tt is its unobserved trend compo-nent and laquot and ht are mutually uncorrelated whitenoise sequences with relative variance var(laquot)var(ht)As discussed in Andrew Harvey and Albert Jaeger(1993) and Robert King and Sergio Rebelo (1993) theHP filter is the minimum mean square error lineartwo-sided trend extraction filter for this modelBecause our focus is on forecasting we use the optimalone-sided analogue of this filter so that future values ofyt (which would not be available for real time forecast-ing) are not used in the detrending operation The fil-ter is implemented with var(laquot)var(ht) 5 00675which corresponds to the usual value of the HPsmoothing parameter (l 5 1600)

REFERENCES

Akerlof George A William T Dickens and George LPerry 2000 ldquoNear-Rational Wage and Price Settingand the Optimal Rates of Inflation and Unemploy-mentrdquo Brookings Papers Econ Act 20001 pp1ndash44

Andrews Donald WK 1993 ldquoTests for ParameterInstability and Structural Change with UnknownChange Pointrdquo Econometrica 614 pp 821ndash56

Ang Andrew Monika Piazzesi and Min Wei 2003ldquoWhat Does the Yield Curve Tell Us About GDPGrowthrdquo manuscript Columbia Business School

Atkeson Andrew and Lee E Ohanian 2001 ldquoArePhillips Curves Useful for Forecasting Inflationrdquo FedReserve Bank Minneapolis Quart Rev 251 pp 2ndash11

Atta-Mensah Joseph and Greg Tkacz 2001ldquoPredicting Recessions and Booms Using FinancialVariablesrdquo Can Bus Econ 83 pp 30ndash36

Barr David G and John Y Campbell 1997 ldquoInflationReal Interest Rates and the Bond Market A Study

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 39: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

826 Journal of Economic Literature Vol XLI (September 2003)

TABLE 9DATA SAMPLE PERIODS

Series Canada France Germany Italy Japan UK US

Interest rate overnight 751ndash9912m 641ndash993m 601ndash9912m 711ndash9912m 591ndash9912m 721ndash9912m 591ndash9912m

short-term govrsquot bills 591ndash9912m 701ndash9912m 757ndash9912m 745ndash996m 641ndash9912m 591ndash9912m

short-term govrsquot 702ndash996m 661ndash9912m 591ndash9912mbonds

medium-term govrsquot 591ndash9912m 591ndash9912mbonds

long-term govrsquot bonds 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 6610ndash999m 591ndash9912m 591ndash9912m

Exchange rate 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m 731ndash9912m

Stock prices 591ndash9912m 591ndash9912m 591ndash9912 591ndash9912m 591ndash9912m 591ndash993m 591ndash9912m

Dividend price index 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 701ndash971q 591ndash964q

Housing price index 701ndash984q 701ndash984q 701ndash984q 701ndash984q

Gold price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Silver price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Real GDP 591ndash994q 701ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Nominal GDP 591ndash994q 651ndash994q 601ndash994q 601ndash994q 591ndash994q 591ndash994q 591ndash994q

Industrial production 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9812m 591ndash9912m 591ndash9912m 591ndash9912m

Capacity utilization rate 621ndash994q 761ndash994q 701ndash994q 621ndash984m 681ndash9912 591ndash9912m

Employment 591ndash9912m 701ndash994q 601ndash9912m 601ndash904q 591ndash991q 601ndash994q 591ndash9912m

Unemployment rate 591ndash9912m 744ndash991q 621ndash9912m 601ndash994q 591ndash991q 601ndash994q 591ndash9912m

CPI 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

PPI 591ndash9912m 591ndash9911m 811ndash9911m 591ndash9912m 591ndash9912m 591ndash9912m

Earnings 591ndash9912m 601ndash994m 621ndash9912m 591ndash9912m 631ndash9912m 591ndash9912m

Oil price 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Commodity price index 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m 591ndash9912m

Money supply m0 701ndash9912m 591ndash9912m

Money supply m1 591ndash9912m 771ndash984m 601ndash984m 621ndash984m 631ndash9912m 591ndash9912m

Money supply m2 591ndash9912m 601ndash984m 741ndash984m 671ndash9912m 591ndash9912m

Money supply m3 591ndash9912m 601ndash984m 691ndash9812m 621ndash984m 7112ndash9912m 591ndash9912m

Notes The table entries show the sample periods of each data series for each country Blank cells indicate miss-ing data m means the data series is monthly and q means quarterly

of UK Nominal and Index-Linked GovernmentBond Pricesrdquo J Monet Econ 393 pp 361ndash83

Barro Robert J 1990 ldquoThe Stock Market andInvestmentrdquo Rev Finan Stud 31 pp 115ndash31

Bates JM and Clive WJ Granger 1969 ldquoTheCombination of Forecastsrdquo Operations Res Quart20 pp 319ndash25

Bernanke Ben S 1983 ldquoNonmonetary Effects of theFinancial Crisis in the Propagation of the GreatDepressionrdquo Amer Econ Rev 733 pp 257ndash76

mdashmdashmdash 1990 ldquoOn the Predictive Power of InterestRates and Interest Rate Spreadsrdquo New EnglandEcon Rev NovDec pp 51ndash68

Bernanke Ben S and Alan S Blinder 1992 ldquoTheFederal Funds Rate and the Channels of Monetary

Transmissionrdquo Amer Econ Rev 824 pp 901ndash21Bernanke Ben S and Ilian Mihov 1998 ldquoMeasuring

Monetary Policyrdquo Quart J Econ 1133 pp 869ndash902Bernanke Ben S and Frederic S Mishkin 1992 ldquoThe

Predictive Power of Interest Rate Spreads Evidencefrom Six Industrialized Countriesrdquo manuscriptPrinceton U

Bernard Henri and Stefan Gerlach 1998 ldquoDoes theTerm Structure Predict Recessions The InternationalEvidencerdquo Int J Finance Econ 33 pp 195ndash215

Blanchard Olivier 1993 ldquoConsumption and theRecession of 1990ndash1991rdquo Amer Econ Rev 832 pp270ndash74

Blanchard Olivier and John Simon 2001 ldquoThe Longand Large Decline in US Output Volatilityrdquo

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 40: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 827

Brookings Papers Econ Act 20011 pp 135ndash64Bonser-Neal Catherine and Timothy R Morley 1997

ldquoDoes the Yield Spread Predict Real EconomicActivity A Multicountry Analysisrdquo Fed ReserveBank Kansas City Econ Rev 823 pp 37ndash53

Brainard William C and George L Perry 2000ldquoMaking Policy in a Changing Worldrdquo in EconomicEvents Ideas and Policies The 1960s and AfterGL Perry and J Tobin eds Brookings InstitutionWashington DC

Campbell John Y 1999 ldquoAsset Prices Consumptionand the Business Cyclerdquo in The Handbook ofMacroeconomics Vol 1 John B Taylor and MichaelWoodford eds Amsterdam Elsevier pp 1231ndash303

Campbell John Y Martin Lettau Burton G Malkieland Yexaio Xu 2001 ldquoHave Individual StocksBecome More Volatile An Empirical Exploration ofIdiosyncratic Riskrdquo J Finance 561 pp 1ndash43

Canova Fabio and Gianni De Nicolo 2000 ldquoStockReturn Term Structure Real Activity and InflationAn International Perspectiverdquo Macroecon Dynam43 pp 343ndash72

Cecchetti Stephen G Rita S Chu and CharlesSteindel 2000 ldquoThe Unreliability of InflationIndicatorsrdquo Fed Reserve Bank New York CurrentIssues Econ Finance 64 pp 1ndash6

Chen Nai-Fu 1991 ldquoFinancial Investment Oppor-tunities and the Macroeconomyrdquo J Finance 462 pp 529ndash54

Clarida Richard Jordi Galiacute and Mark Gertler 2000ldquoMonetary Policy Rules and Macroeconomic Stabil-ity Evidence and Some Theoryrdquo Quart J Econ1151 pp 147ndash80

Clark Todd E and Michael W McCracken 2001ldquoTests of Equal Forecast Accuracy and Encompass-ing for Nested Modelsrdquo J Econometrics 105 pp85ndash100

mdashmdashmdash 2002 ldquoForecast-Based Model Selection in thePresence of Structural Breaksrdquo Fed Reserve BankKansas City discuss paper RWP 02-05

Clemen RT 1989 ldquoCombining Forecasts A Reviewand Annotated Bibliographyrdquo Int J Forecasting 5pp 559ndash83

Clements Michael P and David F Hendry 1999Forecasting Non-stationary Economic Time SeriesCambridge MA MIT Press

Cogley Timothy and Thomas J Sargent 2001 ldquoEvolvingPost-World War II US Inflation Dynamicsrdquo NBERMacroeconomics Annual pp 331ndash72

mdashmdashmdash 2002 ldquoDrifts and Volatilities MonetaryPolicies and Outcomes in the Post WWII USrdquomanuscript Stanford U

Cook Timothy 1981 ldquoDeterminants of the Spreadbetween Treasury Bill Rates and Private SectorMoney Market Ratesrdquo J Econ Bus 33 pp 177ndash87

Croushore Dean and Tom Stark 2003 ldquoA Real-TimeData Set for Macroeconomists Does the DataVintage Matterrdquo forthcoming Rev Econ Statist

Davis E Philip and Gabriel Fagan 1997 ldquoAreFinancial Spreads Useful Indicators of FutureInflation and Output Growth in EU Countriesrdquo JApplied Econometrics 12 pp 701ndash14

Davis E Philip and SGB Henry 1994 ldquoThe Use ofFinancial Spreads as Indicator Variables Evidence

for the United Kingdom and Germanyrdquo IMF StaffPapers 41 pp 517ndash25

Diebold Francis X 1998 Elements of ForecastingCincinnati Southwestern Publishing

Diebold Francis X and Robert S Mariano 1995ldquoComparing Predictive Accuracyrdquo J Bus EconStatist 13 pp 253ndash63

Dotsey Michael 1998 ldquoThe Predictive Content of theInterest Rate Term Spread for Future EconomicGrowthrdquo Fed Reserve Bank Richmond Econ Quart843 pp 31ndash51

Duca John V 1999 ldquoWhat Credit Market IndicatorsTell Usrdquo Fed Reserve Bank Dallas Econ Finan Rev1999Q3 pp 2ndash13

Dueker Michael J 1997 ldquoStrengthening the Case forthe Yield Curve as a Predictor of US RecessionsrdquoFed Reserve Bank St Louis Rev 792 pp 41ndash50

Emery Kenneth M 1996 ldquoThe Information Contentof the PaperndashBill Spreadrdquo J Econ Business 48 pp1ndash10

Estrella Arturo and Frederic S Mishkin 1997 ldquoThePredictive Power of the Term Structure of InterestRates in Europe and the United States Implicationsfor the European Central Bankrdquo Europ Econ Rev41 pp 1375ndash401

mdashmdashmdash 1998 ldquoPredicting US Recessions FinancialVariables as Leading Indicatorsrdquo Rev EconStatistics 80 pp 45ndash61

Estrella Arturo and Gikas Hardouvelis 1991 ldquoTheTerm Structure as a Predictor of Real EconomicActivityrdquo J Finance 462 pp 555ndash76

Estrella Arturo Anthony P Rodrigues and SebastianSchich 2003 ldquoHow Stable is the Predictive Power of the Yield Curve Evidence from Germany and the United Statesrdquo Rev Econ Statistics 853 forth-coming

Fama Eugene F 1981 ldquoStock Returns Real ActivityInflation and Moneyrdquo Amer Econ Rev 71 pp545-65

mdashmdashmdash 1990 ldquoTerm-Structure Forecasts of InterestRates Inflation and Real Returnsrdquo J Monet Econ251 pp 59ndash76

Feldstein Martin and James H Stock 1994 ldquoThe Useof a Monetary Aggregate to Target Nominal GDPrdquoin NG Mankiw (ed) Monetary Policy Chicago UChicago Press for the NBER pp 7ndash70

Fischer Stanley and Robert C Merton 1984 ldquoMacro-economics and Finance The Role of the StockMarketrdquo Carnegie-Rochester Conference Series onPublic Policy 21 pp 57ndash108

Frankel Jeffrey A 1982 ldquoA Technique for Extracting aMeasure of Expected Inflation from the InterestRate Term Structurerdquo Rev of Econ and Statistics641 pp 135ndash142

Frankel Jeffrey A and Cara S Lown 1994 ldquoAnIndicator of Future Inflation Extracted from theSteepness of the Interest Rate Yield Curve Along ItsEntire Lengthrdquo Quart J Econ 59 pp 517ndash30

Friedman Benjamin M and Kenneth N Kuttner1992 ldquoMoney Income Prices and Interest RatesrdquoAmer Econ Rev 82June pp 472ndash92

mdashmdashmdash 1993a ldquoWhy Does the Paper-Bill SpreadPredict Real Economic Activityrdquo in Business CyclesIn-dicators and Forecasting James H Stock and

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 41: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

828 Journal of Economic Literature Vol XLI (September 2003)

Mark W Watson eds Chicago U Chicago Pressmdashmdashmdash 1993b ldquoEconomic Activity and the Short-term

Credit Markets An Analysis of Prices and Quan-titiesrdquo Brookings Papers Econ Act 19932 pp193ndash266

mdashmdashmdash 1998 ldquoIndicator Properties of the Paper-BillSpread Lessons from Recent Experiencerdquo RevEcon Statistics 80 pp 34ndash44

Galbraith John W and Greg Tkacz 2000 ldquoTesting forAsymmetry in the Link Between the Yield Spreadand Output in the G-7 Countriesrdquo J Int MoneyFinance 19 pp 657ndash72

Gerlach Stefan 1997 ldquoThe information Content of theTerm Structure Evidence for Germanyrdquo EmpiricalEcon 222 pp 161-80

Gertler Mark and Cara S Lown 2000 ldquoTheInformation in the High Yield Bond Spread for theBusiness Cycle Evidence and Some ImplicationsrdquoNBER work paper 7549

Goodhart Charles and Boris Hofmann 2000a ldquoDoAsset Prices Help to Predict Consumer PriceInflationrdquo Manchester School Supplement 68 pp122ndash40

mdashmdashmdash 2000b ldquoFinancial Variables and the Conductof Monetary Policyrdquo Sveriges Riksbank work paper 112

Gordon Robert J 1982 ldquoInflation Flexible ExchangeRates and the Natural Rate of Unemploymentrdquo inMN Baily (ed) Workers Jobs and InflationBrookings Institution

mdashmdashmdash 1997 ldquoThe Time-Varying NAIRU and itsImplications for Economic Policyrdquo J EconPerspectives 111 pp 11ndash32

mdashmdashmdash 1998 ldquoFoundations of the GoldilocksEconomy Supply Shocks and the Time-VaryingNAIRUrdquo Brookings Papers Econ Activity 19982pp 297ndash333

Granger Clive WJ and Ramu Ramanathan 1984ldquoImproved Methods for Combining Forecastsrdquo JForecasting 3 pp 197ndash204

Guo Hui 2002 ldquoStock Market Returns Volatility andFuture Outputrdquo Fed Reserve Bank St Louis Rev845 pp 75ndash85

Hafer Rik W and Ali M Kutan 1992 ldquoMoney InterestRates and Output Another Lookrdquo Southern IllinoisU Edwardsville econ dept work paper 92-0303

Hamilton James D and Dong Heon Kim 2002 ldquoARe-Examination of the Predictability of EconomicActivity using the Yield Spreadrdquo J Money Creditand Banking 34 pp 340ndash360

Harvey Andrew C and Albert Jaeger 1993ldquoDetrending Stylized Facts and the Business CyclerdquoJ Applied Econometrics 83 pp 231ndash48

Harvey Campbell R 1988 ldquoThe Real Term Structureand Consumption Growthrdquo J Finan Econ 22 pp305ndash333

mdashmdashmdash 1989 ldquoForecasts of Economic Growth fromthe Bond and Stock Marketsrdquo Finan Analysts J455 pp 38ndash45

mdashmdashmdash 1991 ldquoThe Term Structure and WorldEconomic Growthrdquo J Fixed Income June pp 7ndash19

mdashmdashmdash 1993 ldquoThe Term Structure ForecastsEconomic Growthrdquo Finan Analysts J 493 pp 6ndash8

Haubrich Joseph G and Ann M Dombrosky 1996

ldquoPredicting Real Growth Using the Yield Curverdquo FedReserve Bank Cleveland Econ Rev 321 pp 26ndash34

Hodrick Robert and Edward Prescott 1981 ldquoPost-warUS Business Cycles An Empirical Investigationrdquowork paper Carnegie-Mellon U printed in JMoney Credit and Banking 29 (1997) pp 1-16

Hu Zuliu 1993 ldquoThe Yield Curve and Real ActivityrdquoIMF Staff Papers 40 pp 781ndash806

Jaditz Ted Leigh A Riddick and Chera L Sayers1998 ldquoMultivariate Nonlinear Forecasting UsingFinancial Information to Forecast the Real SectorrdquoMacroeconomic Dynamics 2 pp 369ndash382

Jorion Philippe and Frederic S Mishkin 1991 ldquoAMulti-Country Comparison of Term StructureForecasts at Long Horizonsrdquo J Finan Econ 29pp 59ndash80

Katz Lawrence F and Alan B Krueger 1999 ldquoTheHigh-Pressure US Labor Market of the 1990srdquoBrookings Papers on Econ Activity 19991 pp 1ndash87

Kim C-J and Charles R Nelson 1999 ldquoHas the USEconomy Become More Stable A BayesianApproach Based on a Markov-Switching Model ofthe Business Cyclerdquo The Rev of Econ and Statistics81 pp 608ndash616

King Robert G and Sergio T Rebelo 1993 ldquoLowFrequency Filtering and Real Business Cyclesrdquo JEcon Dynamics Control 17 pp 207ndash31

Kozicki Sharon 1997 ldquoPredicting Real Growth andInflation with the Yield Spreadrdquo Fed Reserve BankKansas City Econ Rev 82 pp 39ndash57

Lahiri Kajal and Jiazhou G Wang 1996 ldquoInterest RateSpreads as Predictors of Business Cyclesrdquo inHandbook of Statistics Methods in Finance Vol 14GS Maddala and CR Rao eds Amsterdam Elsevier

Larocque Guy 1977 ldquoAnalyse drsquoune methode dedesaisonnalisat ion le programme X11 du USBureau of Census version trimestriellerdquo Annale delrsquoINSEE 28

Laubach Thomas 2001 ldquoMeasuring the NAIRUEvidence from Seven Economiesrdquo Rev Econ Sta-tistics 83 pp 218ndash31

Laurent Robert D 1988 ldquoAn Interest Rate-BasedIndicator of Monetary Policyrdquo Fed Reserve BankChicago Econ Perspectives 121 pp 3ndash14

mdashmdashmdash 1989 ldquoTesting the Spreadrdquo Fed Reserve BankChicago Econ Perspectives 13 pp 22ndash34

Lettau Martin and Sydney Ludvigson 2000ldquoUnderstanding Trend and Cycle in Asset Valuesrdquomanuscript research dept Fed Reserve BankNew York

mdashmdashmdash 2001 ldquoConsumption Aggregate Wealth andExpected Stock Returnsrdquo J Finance 56 pp 815ndash49

Ludvigson Sydney and Charles Steindel 1999 ldquoHowImportant is the Stock Market Effect on Consump-tionrdquo Fed Reserve Bank New York Econ Policy Rev52 pp 29ndash51

Marcellino Massimiliano James H Stock and Mark WWatson 2000 ldquoMacroeconomic Forecasting in theEuro Area Country Specific vs Area-Wide Infor-mationrdquo forthcoming Europ Econ Rev

McConnell Margaret M and Gabriel Perez-Quiros2000 ldquoOutput Fluctuations in the United StatesWhat has Changed Since the Early 1980rsquosrdquo AmerEcon Rev 905 pp 1464ndash76

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84

Page 42: Forecasting Output and Inflation: The Role of Asset Prices · able research on forecasting economic activ-ity and inflation using asset prices, where we interpret asset prices broadly

Stock and Watson Forecasting Output and Inflation 829

McCracken Michael W 1999 ldquoAsymptotics for Out-of-Sample Tests of Causalityrdquo manuscript UMissouri-Columbia

McCracken Michael W and Kenneth D West 2002ldquoInference about Predictive Abilityrdquo in Companion toEconomic Forecasting Michael P Clements andDavid F Hendry eds Oxford Blackwell pp 299ndash321

Mishkin Frederic S 1990a ldquoWhat Does the TermStructure Tell Us About Future Inflationrdquo J MonetEcon 25 pp 77ndash95

mdashmdashmdash 1990b ldquoThe Information in the Longer-Maturity Term Structure About Future In-flationrdquoQuart J Econ 55 pp 815ndash28

mdashmdashmdash 1991 ldquoA Multi-Country Study of theInformation in the Term Structure About FutureInflationrdquo J Int Money Finance 19 pp 2ndash22

Mitchell Wesley C and Arthur F Burns 1938Statistical Indicators of Cyclical Revivals NBERBulletin 69 NY Reprinted in Business CycleIndicators GH Moore ed 1961 PrincetonPrinceton U Press

Newbold Paul and David I Harvey 2002 ldquoForecastCombination and Encompassingrdquo in A Companionto Economic Forecasting MP Clements and DFHendry eds Oxford Blackwell Press

Pesaran M Hashem and Spyros Skouras 2002ldquoDecision-Based Methods for Forecast Evaluationrdquoin A Companion to Economic Forecasting M PClements and D F Hendry eds Oxford BlackwellPress

Pivetta Frederic and Ricardo Reis 2002 ldquoThePersistence of Inflation in the United Statesrdquo manu-script Harvard U

Plosser Charles I and K Geert Rouwenhorst 1994ldquoInternational Term Structures and Real EconomicGrowthrdquo J Monet Econ 33 pp 133ndash56

Poon Ser-Huang and Clive WJ Granger 2003ldquoForecasting Volatility in Financial Markets AReviewrdquo J Econ Lit 412 pp 478ndash539

Quandt Richard E 1960 ldquoTests of the Hypothesis thata Linear Regression Obeys Two Separate RegimesrdquoJ Amer Statist Assoc 55 pp 324ndash30

Sims Christopher A 1980 ldquoA Comparison of Interwarand Postwar Cycles Monetarism ReconsideredrdquoAmer Econ Rev 70May pp 250ndash57

Sims Christopher A and Tao Zha 2002 ldquoMacroeco-nomic Switchingrdquo manuscript Princeton U

Smets Frank and Kostas Tsatsaronis 1997 ldquoWhy Doesthe Yield Curve Predict Economic Activity Dissec-ting the Evidence for Germany and the UnitedStatesrdquo BIS work paper 49

Staiger Douglas James H Stock and Mark W Watson1997a ldquoThe NAIRU Unemployment and MonetaryPolicyrdquo J Econ Perspect 11Winter pp 33ndash51

mdashmdashmdash 1997b ldquoHow Precise Are Estimates of theNatural Rate of Unemploymentrdquo in ReducingInflation Motivation and Strategy C Romer and DRomer eds U Chicago Press for NBER pp195ndash242

mdashmdashmdash 2001 ldquoPrices Wages and US NAIRU in the1990srdquo in The Roaring Nineties A Krueger and RSolow eds NY Russell Sage Foundation CenturyFund pp 3ndash60

Stock James H 1998 ldquoDiscussion of GordonrsquoslsquoFoundations of the Goldilocks Economyrsquordquo Brook-ings Papers Econ Act 19982 pp 334ndash41

Stock James H and Mark W Watson 1989 ldquoNewIndexes of Coincident and Leading EconomicIndicatorsrdquo in NBER Macroeconomics Annual 1989OJ Blanchard and S Fischer eds pp 352ndash94

mdashmdashmdash 1996 ldquoEvidence on Structural Instability inMacroeconomic Time Series Relationsrdquo J BusinessEcon Statistics 14 pp 11ndash29

mdashmdashmdash 1998 ldquoMedian Unbiased Estimation ofCoefficient Variance in a Time Varying ParameterModelrdquo J Amer Statistical Assoc 93 pp 349ndash58

mdashmdashmdash 1999a ldquoBusiness Cycle Fluctuations in USMacroeconomic Time Seriesrdquo in Handbook ofMacroeconomics Vol 1 JB Taylor and MWoodford eds pp 3ndash64

mdashmdashmdash 1999b ldquoForecasting Inflationrdquo J Monet Econ44 pp 293ndash335

mdashmdashmdash 1999c ldquoA Comparison of Linear andNonlinear Univariate Models for ForecastingMacroeconomic Time Seriesrdquo in CointegrationCausality and Forecasting A Festschrift for CliveWJ Granger R Engle and H White eds OxfordOxford U Press pp 1ndash44

mdashmdashmdash 2001 ldquoForecasting Output and Inflation TheRole of Asset Pricesrdquo NBER work paper 8180

mdashmdashmdash 2002 ldquoHas the Business Cycle Changed andWhyrdquo NBER Macroeconomics Annual 2002Cambridge MIT Press

mdashmdashmdash 2003a Introduction to Econometrics BostonAddison Wesley Longman

mdashmdashmdash 2003b ldquoHow Did Leading Indicator ForecastsDo During the 2001 Recessionrdquo Fed Reserve BankRichmond Econ Quart 893 forthcoming

Swanson Norman R and Halbert White 1995 ldquoAModel Selection Approach to Assessing theInformation in the Term Structure Using LinearModels and Artificial Neural Networksrdquo J BusinessEcon Statistics 13 pp 265ndash80

mdashmdashmdash 1997 ldquoA Model Selection Approach to Real-Time Macroeconomic Forecasting Using LinearModels and Artificial Neural Networksrdquo Rev EconStatistics 79 pp 540ndash50

Thoma Mark A and Jo Anna Gray 1994 ldquoOn LeadingIndicators Is There a Leading Contenderrdquo manu-script U Oregon

Tkacz Greg 2001 ldquoNeural Network Forecasting ofCanadian GDP Growthrdquo Int J Forecasting 17 pp57ndash69

Wallis Kenneth F 1974 ldquoSeasonal Adjustment andRelations Between Variablesrdquo J Amer StatisticalAssoc 69 pp 18ndash31

West Kenneth D 1996 ldquoAsymptotic Inference aboutPredictive Abilityrdquo Econometrica 64 pp 1067ndash84