
Advanced Topics in Analysis of Economic and Financial Data Using R and SAS¹

ZONGWU CAI (a,b)

E-mail address: [email protected]

(a) Department of Mathematics & Statistics and Department of Economics, University of North Carolina, Charlotte, NC 28223, U.S.A.
(b) Wang Yanan Institute for Studies in Economics, Xiamen University, China

December 27, 2007

©2007, ALL RIGHTS RESERVED by ZONGWU CAI

¹ This manuscript may be printed and reproduced for individual or instructional use, but may not be printed for commercial purposes.


Preface

This is an advanced-level course in econometrics and financial econometrics, with some basic theory and heavy applications. Here, our focus is on the SKILLS of analyzing real data using advanced econometric techniques and statistical software such as SAS and R. This is in line with our WISE spirit: "STRONG THEORETICAL FOUNDATION and SKILL EXCELLENCE". In other words, this course covers some advanced topics in the analysis of economic and financial data, particularly nonlinear time series models and some models related to economic and financial applications. The topics covered range from classical approaches to modern modeling techniques, even up to the research frontiers. The difference between this course and others is that you will learn, step by step, how to build a model based on data (the so-called "let the data speak for themselves") through real data examples using statistical software, and how to explore real data using what you have learned. Therefore, no single book serves as a textbook for this course, so materials from several books and articles will be provided. However, the necessary handouts, including computer codes such as SAS and R codes, will be provided with your help (you might be asked to print out the materials by yourself).

Five or six projects, involving heavy computer work, are assigned throughout the term. Group discussion is allowed when doing the projects, particularly when writing the computer codes. But the final report for each project must be written in your own language; copying from each other will be regarded as cheating. If you use the R language, which is similar to SPLUS, you can download it from the public web site at http://www.r-project.org/ and install it on your own computer, or you can use the PCs in our labs. You are STRONGLY encouraged to use (but not limited to) the package R, since it is a very convenient programming language for doing statistical analysis and Monte Carlo simulations, as well as various applications in quantitative economics and finance. Of course, you are welcome to use any one of the other packages such as SAS, MATLAB, GAUSS, STATA, SPSS and EViews. But I might not be able to help you if you do so.

Some materials are based on the lecture notes given by Professor Robert H. Shumway, Department of Statistics, University of California at Davis, and by my colleague, Professor Stanislav Radchenko, Department of Economics, University of North Carolina at Charlotte, and on the book by Tsay (2005). Some datasets are provided by Professor Robert H. Shumway and by Professor Philip Hans Franses of Erasmus University Rotterdam, the Netherlands. I am very grateful to them for providing their lecture notes and datasets. Finally, I must express my many thanks to our Master's student Ms. Huiqun Ma for writing all the SAS codes.


How to Install R?

The main package used is R, which is free from the R Project for Statistical Computing. It also needs Ox with G@RCH to fit various GARCH models. Students may use other packages or programs if they prefer.

Install R

(1) go to the web site http://www.r-project.org/;

(2) click CRAN;

(3) choose a site for downloading, say http://cran.cnr.Berkeley.edu;

(4) click Windows (95 and later);

(5) click base;

(6) click R-2.6.1-win32.exe (version of November 26, 2007) to save this file first and then run it to install (note that the setup program is 29 megabytes and is updated every three months).

The above steps install base R on your computer. If you need to install other packages, do the following:

(7) After it is installed, there is an icon on the screen. Click the icon to get into R;

(8) Go to the top and find packages and then click it;

(9) Go down to Install package(s)... and click it;

(10) There is a new window. Choose a location to download packages from, say USA (CA1), move the mouse there, and click OK;

(11) There is a new window listing all packages. You can select any one of the packages and click OK, or you can select all of them and then click OK.
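
Alternatively, once R is running, packages can be installed directly from the R command line with install.packages(). A minimal sketch (the package names below, e.g. tseries, are only illustrative examples, not requirements of this course):

# Install packages from within R instead of using the menus
chooseCRANmirror()                      # pick a download site
install.packages("tseries")             # a single package (example name)
install.packages(c("MASS", "lars"))     # several packages at once
library(tseries)                        # load an installed package
update.packages()                       # update installed packages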

Install Ox?

OxMetrics™ is a family of software packages providing an integrated solution for the econometric analysis of time series, forecasting, financial econometric modelling, and statistical analysis of cross-section and panel data. OxMetrics consists of a front-end program called OxMetrics, and individual application modules such as PcGive, STAMP, etc.

OxMetrics Enterprise™ is a single product that includes all the important components: the OxMetrics desktop, G@RCH, Ox Professional, PcGive and STAMP.

To install Ox Console, please download the file http://www.math.uncc.edu/~zcai/installation−Ox−[email protected] and follow the steps.


Data Analysis and Graphics Using R – An Introduction (109 pages)

I encourage you to download the file r-notes.pdf (109 pages) from http://www.math.uncc.edu/~zcai/r-notes.pdf and study it on your own. Please see me if you have any questions.


Contents

1 Review of Multiple Regression Models
  1.1 Least Squares Estimation
  1.2 Model Diagnostics
      1.2.1 Box-Cox Transformation
      1.2.2 Reading Materials
  1.3 Computer Codes
  1.4 References

2 Classical and Modern Model Selection Methods
  2.1 Introduction
  2.2 Subset Approaches
  2.3 Sequential Methods
  2.4 Likelihood-Based Criteria
  2.5 Cross-Validation and Generalized Cross-Validation
  2.6 Penalized Methods
  2.7 Implementation in R
      2.7.1 Classical Models
      2.7.2 LASSO Type Methods
      2.7.3 Example
  2.8 Computer Codes
  2.9 References

3 Regression Models With Correlated Errors
  3.1 Methodology
  3.2 Nonparametric Models with Correlated Errors
  3.3 Predictive Regression Models
  3.4 Computer Codes
  3.5 References

4 Estimation of Covariance Matrix
  4.1 Methodology
  4.2 Details (see the paper by Zeileis)
  4.3 Computer Codes
  4.4 References

5 Seasonal Time Series Models
  5.1 Characteristics of Seasonality
  5.2 Modeling
  5.3 Nonlinear Seasonal Time Series Models
  5.4 Computer Codes
  5.5 References

6 Robust and Quantile Regressions
  6.1 Robust Regression
  6.2 Quantile Regression
  6.3 Computer Codes
  6.4 References

7 How to Analyze Boston House Price Data?
  7.1 Description of Data
  7.2 Analysis Methods
      7.2.1 Linear Models
      7.2.2 Nonparametric Models
  7.3 Computer Codes
  7.4 References

8 Value at Risk
  8.1 Methodology
  8.2 R Commands (R Menu)
      8.2.1 Generalized Pareto Distribution
      8.2.2 Background of the Generalized Pareto Distribution
  8.3 Reading Materials I and II (see Handouts)
  8.4 New Developments (Nonparametric Approaches)
      8.4.1 Introduction
      8.4.2 Framework
      8.4.3 Nonparametric Estimating Procedures
      8.4.4 Real Examples
  8.5 References

9 Long Memory Models and Structural Changes
  9.1 Long Memory Models
      9.1.1 Methodology
      9.1.2 Spectral Density
      9.1.3 Applications
  9.2 Related Problems and New Developments
      9.2.1 Long Memory versus Structural Breaks
      9.2.2 Testing for Breaks (Instability)
      9.2.3 Long Memory versus Trends
  9.3 Computer Codes
  9.4 References


List of Tables

2.1 AICC values for ten models for the recruits series
9.1 Critical Values of the QLR statistic with 15% Trimming


List of Figures

1.1 Scatterplot with regression line and lowess smoothed curve.
1.2 Scatterplot with regression line and both lowess and loess smoothed curves as well as the Theil-Sen estimated line.
1.3 Residual plots.
1.4 (a) Leverage plot. (b) Influential plot. (c) Plots without observation #34.
2.1 Monthly SOI (left) and simulated recruitment (right) from a model (n = 453 months, 1950-1987).
2.2 The SOI series (black solid line) compared with a 12-point moving average (red thicker solid line). The left panel: original data; the right panel: filtered series.
2.3 Multiple lagged scatterplots showing the relationship between SOI and the present (xt) versus the lagged values (xt+h) at lags 1 ≤ h ≤ 16.
2.4 Autocorrelation functions of SOI and recruitment and cross-correlation function between SOI and recruitment.
2.5 Multiple lagged scatterplots showing the relationship between the SOI at time t + h, say xt+h (x-axis), versus recruits at time t, say yt (y-axis), 0 ≤ h ≤ 15.
2.6 Multiple lagged scatterplots showing the relationship between the SOI at time t, say xt (x-axis), versus recruits at time t + h, say yt+h (y-axis), 0 ≤ h ≤ 15.
2.7 Partial autocorrelation functions for the SOI (left panel) and the recruits (right panel) series.
2.8 ACF of residuals of AR(1) for SOI (left panel) and the plot of AIC and AICC values (right panel).
3.1 Quarterly earnings for Johnson & Johnson (4th quarter, 1970 to 1st quarter, 1980, left panel) with log transformed earnings (right panel).
3.2 Autocorrelation functions (ACF) and partial autocorrelation functions (PACF) for the detrended log J&J earnings series (top two panels) and the fitted ARIMA(0,0,0)×(1,0,0)4 residuals.
3.3 Time plots of U.S. weekly interest rates (in percentages) from January 5, 1962 to September 10, 1999. The solid line (black) is the Treasury 1-year constant maturity rate and the dashed line (red) the Treasury 3-year constant maturity rate.
3.4 Scatterplots of U.S. weekly interest rates from January 5, 1962 to September 10, 1999: the left panel is 3-year rate versus 1-year rate, and the right panel is changes in 3-year rate versus changes in 1-year rate.
3.5 Residual series of linear regression Model I for two U.S. weekly interest rates: the left panel is a time plot and the right panel is the ACF.
3.6 Time plots of the change series of U.S. weekly interest rates from January 12, 1962 to September 10, 1999: changes in the Treasury 1-year constant maturity rate are denoted by the black solid line, and changes in the Treasury 3-year constant maturity rate are indicated by the red dashed line.
3.7 Residual series of the linear regression models: Model II (top) and Model III (bottom) for two change series of U.S. weekly interest rates: time plot (left) and ACF (right).
5.1 US Retail Sales Data from 1967-2000.
5.2 Four-weekly advertising expenditures on radio and television in The Netherlands, 1978.01-1994.13.
5.3 Number of live births 1948(1)-1979(1) and residuals from models with a first difference, a first difference and a seasonal difference of order 12, and a fitted ARIMA(0,1,1)×(0,1,1)12 model.
5.4 Autocorrelation functions and partial autocorrelation functions for the birth series (top two panels), the first difference (second two panels), an ARIMA(0,1,0)×(0,1,1)12 model (third two panels), and an ARIMA(0,1,1)×(0,1,1)12 model (last two panels).
5.5 Autocorrelation functions (ACF) and partial autocorrelation functions (PACF) for the log J&J earnings series (top two panels), the first difference (second two panels), an ARIMA(0,1,0)×(1,0,0)4 model (third two panels), and an ARIMA(0,1,1)×(1,0,0)4 model (last two panels).
5.6 ACF and PACF for the ARIMA(0,1,1)×(0,1,1)4 model (top two panels) and the residual plots of ARIMA(0,1,1)×(1,0,0)4 (left bottom panel) and ARIMA(0,1,1)×(0,1,1)4 (right bottom panel).
5.7 Monthly simple return of CRSP Decile 1 index from January 1960 to December 2003: time series plot of the simple return (left top panel), time series plot of the simple return after adjusting for the January effect (right top panel), the ACF of the simple return (left bottom panel), and the ACF of the adjusted simple return (right bottom panel).
6.1 Different loss functions for quadratic, Huber's ρc(·), ρc,0(·), ρ0.05(·), LAD and ρ0.95(·), where c = 1.345.
7.1 The results from model (7.1).
7.2 (a) Residual plot for model (7.1). (b) Plot of g1(x6) versus x6. (c) Residual plot for model (7.2). (d) Density estimate of Y.
7.3 Boston Housing Price Data: displayed in (a)-(d) are the scatter plots of the house price versus the covariates X13, X6, X1 and X1*, respectively.
7.4 Boston Housing Price Data: the plots of the estimated coefficient functions for three quantiles τ = 0.05 (solid line) for model (7.4), τ = 0.50 (dashed line), and τ = 0.95 (dotted line), and the mean regression (dot-dashed line): a0,τ(u) and a0(u) versus u in (a), a1,τ(u) and a1(u) versus u in (b), and a2,τ(u) and a2(u) versus u in (c). The thick dashed lines indicate the 95% pointwise confidence interval for the median estimate with the bias ignored.
8.1 (a) 5% CVaR estimate for DJI index. (b) 5% CES estimate for DJI index.
8.2 (a) 5% CVaR estimates for IBM stock returns. (b) 5% CES estimates for IBM stock returns. (c) 5% CVaR estimates for three different values of lagged negative IBM returns (−0.275, −0.025, 0.325). (d) 5% CVaR estimates for three different values of lagged negative DJI returns (−0.225, 0.025, 0.425). (e) 5% CES estimates for three different values of lagged negative IBM returns (−0.275, −0.025, 0.325). (f) 5% CES estimates for three different values of lagged negative DJI returns (−0.225, 0.025, 0.425).
9.1 Sample autocorrelation function of the absolute series of daily simple returns for the CRSP value-weighted (left top panel) and equal-weighted (right top panel) indexes. Sample partial autocorrelation function of the absolute series of daily simple returns for the CRSP value-weighted (left middle panel) and equal-weighted (right middle panel) indexes. The log smoothed spectral density estimation of the absolute series of daily simple returns for the CRSP value-weighted (left bottom panel) and equal-weighted (right bottom panel) indexes.
9.2 Break testing results for the Nile River data: (a) plot of F-statistics; (b) scatterplot with the breakpoint; (c) plot of the empirical fluctuation process with linear boundaries; (d) plot of the empirical fluctuation process with alternative boundaries.
9.3 Break testing results for the oil price data: (a) plot of F-statistics; (b) scatterplot with the breakpoint; (c) plot of the empirical fluctuation process with linear boundaries; (d) plot of the empirical fluctuation process with alternative boundaries.
9.4 Break testing results for the consumer price index data: (a) plot of F-statistics; (b) scatterplot with the breakpoint; (c) plot of the empirical fluctuation process with linear boundaries; (d) plot of the empirical fluctuation process with alternative boundaries.


Chapter 1

Review of Multiple Regression Models

1.1 Least Squares Estimation

We begin our discussion of univariate and multivariate regression (time series) models by considering the idea of a simple regression model, which we have met before in other contexts such as statistics or econometrics courses. All of the multivariate methods follow, in some sense, from the ideas involved in simple univariate linear regression. In this case, we assume that there is some collection of fixed known functions of time, say zt1, zt2, . . . , ztq, that are influencing our output yt, which we know to be random. We express this relation between the inputs (predictors, independent variables, covariates, or exogenous variables) and the output (dependent or response variable) as

yt = β1 zt1 + β2 zt2 + · · · + βq ztq + et (1.1)

at the time points t = 1, 2, · · · , n, where β1, · · · , βq are unknown fixed regression coefficients and et is a random error or noise term, assumed to be white noise; this means that the errors have zero mean, equal variance σ², and are uncorrelated. We traditionally assume also that the white noise series et is Gaussian (normally distributed), in which case the errors are independent. Finally, of course, the basic assumption is that E(yt | zt1, · · · , ztq) is the linear function β1 zt1 + β2 zt2 + · · · + βq ztq.

Question: If at least one of those (four) assumptions is violated, what should we do?

The linear regression model described by (1.1) can be conveniently written in slightly more general matrix notation by defining the column vectors zt = (zt1, . . . , ztq)ᵀ and β = (β1, . . . , βq)ᵀ, so that we write (1.1) in the alternate form

yt = βᵀzt + et. (1.2)


To find estimators for β and σ², it is natural to determine the coefficient vector β minimizing the sum of squared errors (SSE), ∑_{t=1}^n et², with respect to β. Of course, if the distribution of et is known, one should instead maximize the likelihood. This yields the least squares estimator (LSE) or maximum likelihood estimator (MLE) β̂, and the maximum likelihood estimator for σ², which is proportional to the unbiased estimator

σ̂² = (1/(n − q)) ∑_{t=1}^n (yt − β̂ᵀzt)². (1.3)

Note that the LSE is exactly the same as the MLE when the distribution of et is normal (WHY?). An alternate way of writing the model (1.2) is as

y = Zβ + e, (1.4)

where Zᵀ = (z1, z2, · · · , zn) is a q × n matrix composed of the values of the input variables at the observed time points, yᵀ = (y1, y2, · · · , yn) is the vector of observed outputs, and the errors are stacked in the vector e = (e1, e2, · · · , en)ᵀ. The ordinary least squares estimator β̂ is the solution to the normal equations

ZᵀZ β̂ = Zᵀy.

You need not be concerned as to how the above equation is solved in practice, as all computer packages have efficient software for inverting the q × q matrix ZᵀZ to obtain

β̂ = (ZᵀZ)⁻¹ Zᵀy. (1.5)

An important quantity that all software produces is a measure of uncertainty for the estimated regression coefficients, say

Cov(β̂) = σ² (ZᵀZ)⁻¹ ≡ σ² C ≡ σ² (cij). (1.6)

Then, Cov(β̂i, β̂j) = σ² cij, and a 100(1 − α)% confidence interval for βi is

β̂i ± t_df(α/2) σ̂ √cii, (1.7)

where t_df(α/2) denotes the upper 100(α/2)% point of a t distribution with df degrees of freedom. What is the df for our case?

Question: If at least one of those (four) assumptions is violated, are equations (1.6) and (1.7) still true? If not, how can we fix or modify them?
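
Before turning to diagnostics, it may help to see formulas (1.3)-(1.7) in action. The following is a minimal sketch with simulated data (the data and variable names are made up for illustration), computing the LSE, its estimated covariance, and a 95% confidence interval by hand, then checking against lm():

# Simulated data: n observations, q = 3 regressors (including an intercept)
set.seed(1)
n <- 100; q <- 3
Z <- cbind(1, rnorm(n), rnorm(n))            # n x q design matrix
beta <- c(1, 2, -1)
y <- Z %*% beta + rnorm(n)                   # model (1.4) with white noise
bhat <- solve(t(Z) %*% Z, t(Z) %*% y)        # LSE, as in (1.5)
sigma2 <- sum((y - Z %*% bhat)^2)/(n - q)    # unbiased variance estimate, as in (1.3)
C <- sigma2 * solve(t(Z) %*% Z)              # estimated Cov(bhat), as in (1.6)
se <- sqrt(diag(C))
cbind(bhat - qt(0.975, n - q)*se,
      bhat + qt(0.975, n - q)*se)            # 95% CI, as in (1.7) with df = n - q
confint(lm(y ~ Z - 1))                       # should agree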


It seems that it is VERY IMPORTANT to make sure that all assumptions are satisfied. The question is how to do this task. Well, that is what is called model checking or model diagnostics, discussed in the next section.

1.2 Model Diagnostics

1.2.1 Box-Cox Transformation

If the distribution of the error is not normal, a naive way to deal with this problem is to apply a transformation to the dependent variable. A simple and easy-to-use transformation in a regression model is the Box-Cox transformation. When the errors are heterogeneous and often non-Gaussian, a Box-Cox power transformation of the dependent variable is a useful way to alleviate heteroscedasticity when the distribution of the dependent variable is not known. For situations in which the dependent variable Y is known to be positive, the following transformation can be used:

Y^(λ) = (Y^λ − 1)/λ, if λ ≠ 0;  Y^(λ) = log(Y), if λ = 0.

Given the vector of data observations {Yi}_{i=1}^n, one way to select the power λ is to use the λ that maximizes the logarithm of the likelihood function

f(λ) = −(n/2) log[Sn(Y^(λ))] + (λ − 1) ∑_{i=1}^n log(Yi),

where Sn(Y^(λ)) is the sample variance of the transformed data Y^(λ)_i. Generally, we can take λ to be one of the grid values −2 + j∆ (0 ≤ j ≤ 4/∆) and choose the one maximizing f(λ).

Note that the Box-Cox transformation approach can also be applied to handle the case when a covariate enters nonlinearly. For this case, you need to apply a transformation to each individual covariate. When you apply the Box-Cox transformation to covariates, you need to minimize the SSE, instead of maximizing the likelihood, by running a regression. There is no built-in function for this in the standard statistical packages, so when implementing this form of the Box-Cox transformation you have to write your own code. Another way to choose λ is to use the Q-Q plot. The command in R for the Q-Q plot is qqnorm() for the normal Q-Q plot, or qqplot() for the Q-Q plot of one sample versus another.
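
One possible way to implement the grid search over λ described above is sketched below with made-up positive data; it profiles f(λ) on the grid −2 + j∆ and picks the maximizer. (For the response-transformation case, the boxcox() function in the R package MASS provides a similar profile-likelihood plot.)

# Grid search for the Box-Cox power lambda (simulated positive responses)
set.seed(2)
Y <- exp(rnorm(200, mean = 1, sd = 0.5))       # made-up positive data
Delta <- 0.01
lambda.grid <- -2 + (0:(4/Delta)) * Delta      # grid from -2 to 2
f <- function(lam, Y){                         # log-likelihood f(lambda)
  Ylam <- if(abs(lam) > 1e-8) (Y^lam - 1)/lam else log(Y)
  -(length(Y)/2) * log(var(Ylam)) + (lam - 1) * sum(log(Y))
}
fvals <- sapply(lambda.grid, f, Y = Y)
lambda.grid[which.max(fvals)]                  # selected lambda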


1.2.2 Reading Materials

See the materials from the file simple-reg.htm. Also, the file r-reference.htm contains references about how to use R.

1.3 Computer Codes

To fit a multiple regression in R, one can use lm() or glm(); see the following for details

lm(formula, data, subset, weights, na.action,
   method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,
   singular.ok = TRUE, contrasts = NULL, offset, ...)

glm(formula, family = gaussian, data, weights, subset,
    na.action, start = NULL, etastart, mustart,
    offset, control = glm.control(...), model = TRUE,
    method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)

To fit a regression model without an intercept, use

fit1=lm(y~-1+x1+...x9)

where fit1 is the fitted-model object containing all the outputs you need. If you want to do model diagnostic checking, use

plot(fit1)

For multivariate data, it is usually a good idea to view the data as a whole using the pairwise scatter plots generated by the pairs() function:

pairs(data)

The smoothing parameter in lowess can be adjusted using the f = argument. The default is f = 2/3, meaning that 2/3 of the data points are used in calculating the fitted value at each point. If we set f = 1, we get the ordinary regression line. Specifying f = 1/3 should yield a choppier curve.

lowess(x, y = NULL, f = 2/3, iter = 3, delta = 0.01 * diff(range(xy$x[o])))


There's a second loess function in R, loess, that has more options and generates more output.

loess(formula, data, weights, subset, na.action, model = FALSE,
      span = 0.75, enp.target, degree = 2,
      parametric = FALSE, drop.square = FALSE, normalize = TRUE,
      family = c("gaussian", "symmetric"),
      method = c("loess", "model.frame"),
      control = loess.control(...), ...)

Example 1.1: We examine fitting a simple linear regression model to data using R. The data are from Harder and Thompson (1989). As part of a study to investigate reproductive strategies in plants, these biologists recorded the time spent at sources of pollen and the proportions of pollen removed by bumblebee queens and honeybee workers pollinating a species of Erythronium lily. The data set consists of three variables: (1) removed, the proportion of pollen removed; (2) duration, the duration of the visit in seconds; and (3) code, with 1 = bumblebee queens and 2 = honeybee workers. The response variable of interest is removed; the predictor is duration. Because removed is a proportion, it offers special challenges for analysis. We will ignore these issues for the moment. We will also look only at the queen bumblebees; I will leave consideration of the full collection of bees as a homework exercise.

The following R code, for Example 1.1, can be found in the file ex1-1.r.

# 10-12-2006
graphics.off()
beepollen <- read.table('c:/res-teach/xiamen12-06/data/ex1-1.txt', header=T)
#beepollen
queens <- beepollen[beepollen[,3]==1,]    # keep only bumblebee queens (code == 1)
attach(queens)
#print(names(queens))

postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-1.1.eps",
           horizontal=F, width=6, height=6)
#win.graph()
par(bg="light yellow")
plot(duration, removed)
bee.reg <- lm(removed ~ duration)          # simple linear regression
bee.reg
print(anova(bee.reg))
print(summary(bee.reg))
print(names(bee.reg))
abline(bee.reg, lty=1, lwd=3)              # add the fitted regression line
lowess(duration, removed)
lines(lowess(duration, removed), lty=2, col=2, lwd=2)   # lowess smooth
legend(35, .25, c('regression','loess'), lty=c(1,2),
       col=c(1,2), lwd=c(2,2), bty="n")
dev.off()

postscript(file="c:/res-teach/xiamen12-06/figs/fig-1.2.eps",
           horizontal=F, width=6, height=6)
#win.graph()
par(bg="light grey")
plot(duration, removed)
abline(bee.reg, lty=1, lwd=3)
lines(lowess(duration, removed, f=1/3), lty=2, col=6, lwd=2)  # choppier lowess
loess.bee <- loess(removed ~ duration, family="symmetric")
print(names(loess.bee))
lines(sort(loess.bee$x), loess.bee$fitted[order(loess.bee$x)],
      col=4, lty=2, lwd=2)

tsp1reg <- function(x, y){
  # Compute the Theil-Sen regression estimator.
  # Only a single predictor is allowed in this version.
  temp <- matrix(c(x,y), ncol=2)
  temp <- elimna(temp)                 # remove any pairs with missing values
  x <- temp[,1]
  y <- temp[,2]
  ord <- order(x)
  xs <- x[ord]
  ys <- y[ord]
  vec1 <- outer(ys, ys, "-")
  vec2 <- outer(xs, xs, "-")
  v1 <- vec1[vec2 > 0]
  v2 <- vec2[vec2 > 0]
  slope <- median(v1/v2)               # median of all pairwise slopes
  coef <- 0
  coef[1] <- median(y) - slope*median(x)
  coef[2] <- slope
  res <- y - slope*x - coef[1]
  list(coefficients=coef, residuals=res)
}

elimna <- function(m){
  # Remove any rows of data having missing values
  ikeep <- c(1:nrow(m))
  for(i in 1:nrow(m)) if(sum(is.na(m[i,])) >= 1) ikeep[i] <- 0
  m[ikeep[ikeep >= 1],]
}

ts.reg <- tsp1reg(duration, removed)
abline(reg=ts.reg, col=2, lty=1, lwd=4)    # Theil-Sen line
dev.off()

########################################################
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-1.3.eps",
           horizontal=F, width=6, height=6)
#win.graph()
par(mfrow=c(2,2), mex=0.4, bg="light pink")
plot(bee.reg)                              # standard lm diagnostic plots
dev.off()

postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-1.4.eps",
           horizontal=F, width=6, height=6)
#win.graph()
par(mfrow=c(2,2), mex=0.4, bg="light blue")
h <- hatvalues(bee.reg)                    # leverages
plot(h, type='h')
abline(h=c(2,3)*mean(h), col=4, lty=2)     # usual leverage cutoffs
plot(c(0,.6), c(-3,2), xlab='leverage',
     ylab='studentized residual', type='n')
cook <- sqrt(cooks.distance(bee.reg))
points(hatvalues(bee.reg), rstudent(bee.reg),
       cex=10*cook/max(cook))              # point size reflects Cook's distance
abline(h=c(-2,0,2), lty=2, col=2)
abline(v=c(2,3)*mean(hatvalues(bee.reg)), lty=2, col=4)
#identify(hatvalues(bee.reg), rstudent(bee.reg))
# takes a long time to do this
bee.reg1 <- update(bee.reg, subset=-34)        # drop observation 34
bee.reg2 <- update(bee.reg, subset=-c(14,34))  # drop observations 14 and 34
plot(duration, removed)
abline(bee.reg, col=1, lty=1, lwd=3)
abline(bee.reg1, col=2, lty=2, lwd=3)
abline(bee.reg2, col=3, lty=2, lwd=3)
abline(reg=ts.reg, col=4, lty=1, lwd=3)
legend(25, .3, c('all observations','without #34','without #14, #34',
       'Theil-Sen'), lty=c(1,2,2,1), col=1:4, lwd=rep(2,4), bty='n')
dev.off()
print(c('I am done'))

Figure 1.1: Scatterplot with regression line and lowess smoothed curve.

Figure 1.2: Scatterplot with regression line and both lowess and loess smoothed curves as well as the Theil-Sen estimated line.

Figure 1.3: Residual plots (panels: Residuals vs Fitted, Normal Q-Q, Scale-Location, Residuals vs Leverage).

Figure 1.4: (a) Leverage plot. (b) Influential plot. (c) Plots without observation #34.


1.4 References

Harder, L.D. and J.D. Thompson (1989). Evolutionary options for maximizing pollen dispersal of animal-pollinated plants. American Naturalist, 133, 323-344.


Chapter 2

Classical and Modern Model Selection Methods

2.1 Introduction

Given a possibly large set of potential predictors, which ones do we include in our model? Suppose X1, X2, · · · is a pool of potential predictors. The model with all predictors,

Y = β0 + β1 X1 + β2 X2 + · · · + ε,

is the most general model. It holds even if some of the individual βj's are zero. But if some βj's are zero or close to zero, it is better to omit those Xj's from the model. There are two reasons why you should omit variables whose coefficients are close to zero:

(a) Parsimony principle: Given two models that perform equally well in terms of prediction, one should choose the model that is more parsimonious (simpler).

(b) Prediction principle: The model should give predictions that are as accurate as possible, not just for the current observations, but for future observations as well. Including unnecessary predictors can apparently improve prediction for the current data, but can harm prediction for future data. Note that the sum of squared errors (SSE) never increases as we add more predictors.

Next, we discuss the methods available in the literature.


2.2 Subset Approaches

The all-possible-regressions procedure calls for considering all possible subsets of the pool of potential predictors and identifying, for detailed examination, a few good subsets according to some criterion. The purpose of the all-possible-regressions approach is to identify a small group of regression models that are good according to a specified criterion (summary statistic), so that a detailed examination can be made of these models, leading to the selection of the final regression model to be employed. The main problem with this approach is that it is computationally expensive. For example, with 10 predictors, we need to investigate 2¹⁰ = 1024 potential regression models. With the aid of modern computing power, this computation is possible. But still, examining 1024 possible models carefully would be an overwhelming task for a data analyst.

Different criteria for comparing the regression models may be used with the all-possible-regressions selection procedure. We discuss several summary statistics:

(i) R²p (or SSEp)
(ii) R²adj;p (or MSEp)
(iii) Cp
(iv) PRESSp
(v) Sequential methods
(vi) AIC type criteria

We shall denote the number of all potential predictors in the pool by K − 1. Hence, including an intercept parameter β0, we have K potential parameters. The number of predictors in a subset will be denoted by p − 1, as always, so that there are p parameters in the regression function for this subset of predictors. Thus we have 1 ≤ p ≤ K. Now, we discuss each one in detail.

1. R²p (or SSEp)

The subscript p indicates that there are p parameters (or p − 1 predictors) in the regression model. The coefficient of multiple determination R²p is defined as

R²p = 1 − SSEp/SSTO.

• It measures the proportion of the variance of Y explained by the p − 1 predictors.
• R²p always goes up as we add more predictors.
• R²p varies inversely with SSEp because SSTO is constant for all possible regression models. That is, choosing the model with the largest R²p is equivalent to choosing the model with the smallest SSEp.
• R²p might not be a good criterion. WHY?

2. R²adj;p (or MSEp)

One often considers models with a large R²p value. However, R²p always increases with the number of predictors. Hence it cannot be used to compare models of different sizes. The adjusted coefficient of multiple determination R²adj;p has been suggested as an alternative criterion:

R²adj;p = 1 − [SSEp/(n − p)] / [SSTO/(n − 1)] = 1 − ((n − 1)/(n − p)) (SSEp/SSTO) = 1 − MSEp/[SSTO/(n − 1)].

• It is like R²p but with a penalty for adding unnecessary variables. R²adj;p can go down when a useless predictor is added, and it can even be negative.
• R²adj;p varies inversely with MSEp because SSTO/(n − 1) is constant for all possible regression models. That is, choosing the model with the largest R²adj;p is equivalent to choosing the model with the smallest MSEp.
• R²p is useful when comparing models of the same size, while R²adj;p (or Cp) is used to compare models of different sizes.
• R²adj;p is better than R²p.

3. Mallows Cp

The Mallows Cp is concerned with the total mean squared error of the n fitted values for each subset regression model. The mean squared error concept involves the total error in each fitted value:

Ŷi − µi = (Ŷi − E(Ŷi)) + (E(Ŷi) − µi),

where the first term is the random error, the second term is the bias, and µi is the true mean response at the ith observation. The mean squared error for Ŷi is defined as the expected value of the square of the total error above. It can be shown that

MSE(Ŷi) = E[(Ŷi − µi)²] = Var(Ŷi) + [Bias(Ŷi)]²,

where Bias(Ŷi) = E(Ŷi) − µi. The total mean squared error for all n fitted values Ŷi is the sum over the observations i:

∑_{i=1}^n MSE(Ŷi) = ∑_{i=1}^n Var(Ŷi) + ∑_{i=1}^n [Bias(Ŷi)]².

It can be shown that

∑_{i=1}^n Var(Ŷi) = p σ²  and  ∑_{i=1}^n [Bias(Ŷi)]² = (n − p)[E(S²p) − σ²],

where S²p is the MSE from the current model. Using this, we have

∑_{i=1}^n MSE(Ŷi) = p σ² + (n − p)[E(S²p) − σ²]. (2.1)

Dividing (2.1) by σ² makes it scale-free:

∑_{i=1}^n MSE(Ŷi)/σ² = p + (n − p)[E(S²p) − σ²]/σ².

If the model does not fit well, then S²p is a biased estimate of σ². We can estimate E(S²p) by MSEp and estimate σ² by the MSE from the maximal model (the largest model we can consider), i.e., σ̂² = MSE_{K−1} = MSE(X1, . . . , XK−1). Using these estimators for E(S²p) and σ² gives

Cp = p + (n − p)[MSEp − MSE(X1, . . . , XK−1)]/MSE(X1, . . . , XK−1) = SSEp/MSE(X1, . . . , XK−1) − (n − 2p).

• Small Cp is a good thing. A small value of Cp indicates that the model is relatively precise (has small variance) in estimating the true regression coefficients and predicting future responses. This precision will not improve much by adding more predictors. Look for models with small Cp.
• If we have enough predictors in the regression model so that all the significant predictors are included, then MSEp ≈ MSE(X1, . . . , XK−1) and it follows that Cp ≈ p.
• Thus, Cp close to p is evidence that the predictors in the pool of potential predictors (X1, . . . , XK−1), but not in the current model, are not important.
• Models with considerable lack of fit have values of Cp larger than p.
• Cp can be used to compare models of different sizes.
• If we use all the potential predictors, then Cp = K.
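
As one way to put R²adj;p and Cp to work, the regsubsets() function in the R package leaps searches over subsets and reports both criteria; the following is a minimal sketch with simulated data (the package must be installed first):

# All-possible-subsets selection with adjusted R^2 and Mallows Cp (leaps package)
library(leaps)
set.seed(3)
n <- 100
X <- matrix(rnorm(n*6), n, 6)                    # six potential predictors
y <- 1 + 2*X[,1] - X[,3] + rnorm(n)              # only X1 and X3 matter
all.fits <- regsubsets(x = X, y = y, nvmax = 6)  # best subset of each size
ss <- summary(all.fits)
ss$adjr2                                         # adjusted R^2 for each size
ss$cp                                            # Mallows Cp; look for Cp close to p
which.max(ss$adjr2)                              # size picked by adjusted R^2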

4. PRESSp

The PRESS (prediction sum of squares) criterion is defined as

PRESS = ∑_{i=1}^n ε̂²(i),

where ε̂(i) is the PRESS (prediction) residual for the ith observation, similar to the jackknife residual. The PRESS residual is defined as ε̂(i) = Yi − Ŷ(i), where Ŷ(i) is the fitted value obtained by leaving out the ith observation. Models with small PRESSp fit well in the sense of having small prediction errors. PRESSp can be calculated without fitting the model n times (each time deleting one of the n cases). One can show that

ε̂(i) = ε̂i/(1 − hii),

where hii is the ith diagonal element of the hat matrix H = X(XᵀX)⁻¹Xᵀ.
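
The identity ε̂(i) = ε̂i/(1 − hii) makes PRESS essentially a one-line computation in R for any fitted lm object; a minimal sketch with made-up data:

# PRESS via the hat-matrix identity above
press <- function(fit) sum((resid(fit)/(1 - hatvalues(fit)))^2)
set.seed(4)
x <- rnorm(50); y <- 1 + x + rnorm(50)   # illustrative data
press(lm(y ~ x))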

2.3 Sequential Methods

1. Forward selection

(i) Start with the null model.
(ii) Add the most significant variable if its p-value is less than p_enter (equivalently, if F is larger than F_enter).
(iii) Continue until no more variables enter the model.

2. Backward elimination

(i) Start with the full model.
(ii) Eliminate the least significant variable whose p-value is larger than p_remove (equivalently, whose F is smaller than F_remove).
(iii) Continue until no more variables can be discarded from the model.

3. Stepwise selection

(i) Start with any model.
(ii) Check each predictor that is currently in the model. Suppose the current model contains X1, . . . , Xk. Then the F statistic for Xi is

F = [SSE(X1, . . . , Xi−1, Xi+1, . . . , Xk) − SSE(X1, . . . , Xk)] / MSE(X1, . . . , Xk) ∼ F(1; n − k − 1).

Eliminate the least significant variable whose p-value is larger than p_remove (equivalently, whose F is smaller than F_remove).
(iii) Continue until no more variables can be discarded from the model.
(iv) Add the most significant variable if its p-value is less than p_enter (equivalently, if F is larger than F_enter).
(v) Go to step (ii).
(vi) Repeat until no more predictors can be entered and no more can be discarded.

2.4 Likelihood-Based Criteria

The following is based on Akaike's approach (Akaike, 1973) and subsequent papers; see also the recent book by Burnham and Anderson (2003).

Suppose that f(y) is the true model (unknown) giving rise to the data (y is a vector of data), and that g(y, θ) is a candidate model (with parameter vector θ). We want to find a model g(y, θ) "close to" f(y). The Kullback-Leibler discrepancy (K-L distance) is

K(f, g) = Ef[ log( f(Y)/g(Y, θ) ) ].

This is a measure of how "far" model g is from model f (with reference to model f). Properties:

K(f, g) ≥ 0, and K(f, g) = 0 ⇐⇒ f(·) = g(·).

Of course, we can never know how far our model g is from f. But Akaike (1973) showed that we might be able to estimate something almost as good. Suppose we have two models under consideration: g(y, θ) and h(y, φ). Akaike (1973) showed that we can estimate K(f, g) − K(f, h). It turns out that the difference of maximized log-likelihoods, corrected for a bias, estimates the difference of K-L distances. The maximized likelihoods are Lg(y, θ̂) and Lh(y, φ̂), where θ̂ and φ̂ are the ML estimates of the parameters. Akaike's result: [log(Lg) − q] − [log(Lh) − r] is an asymptotically unbiased estimate (i.e., the bias approaches zero as the sample size increases) of K(f, h) − K(f, g). Here q is the number of parameters estimated in θ (model g) and r is the number of parameters estimated in φ (model h). The price of parameters: the likelihoods in the above expression are penalized by the number of parameters.

The Akaike Information Criterion (AIC) for model g is:

AIC = −2 log(Lg) + 2 q. (2.2)

A bias-corrected version of AIC was proposed by Hurvich and Tsai (1989), called AICC, defined by

AICC = AIC + 2 q(q + 1)/(n − q − 1) = −2 log(Lg) + 2 q n/(n − q − 1). (2.3)

The difference between AIC and AICC is the penalty term. Intuitively, one can think of 2qn/(n − q − 1) in (2.3) as a penalty term to discourage over-parameterization. Shibata (1976) suggested that the AIC has a tendency to overestimate q. By comparing the penalty terms in (2.2) and (2.3), we can see that the factors 2qn/(n − q − 1) and 2q for the AICC and AIC statistics are asymptotically equivalent as n → ∞. The AICC statistic, however, has a more extreme penalty for larger-order models, which counteracts the over-fitting tendency of the AIC.

Another approach is given by the much older notion of Bayesian statistics. In the Bayesian approach, we assume that a priori uncertainty about the value of model parameters is represented by a prior distribution. Upon observing the data, this prior is updated, yielding a posterior distribution. In order to make inferences about the model (rather than its parameters), we integrate across the posterior distribution. Under the assumption that all models are a priori equally likely (because the Bayesian approach requires model priors as well as parameter priors), Bayesian model selection chooses the model with the highest marginal likelihood. The ratio of two marginal likelihoods is called a Bayes factor (BF), which is a widely used method of model selection in Bayesian inference. The two integrals in the Bayes factor are nontrivial to compute unless they form a conjugate family. Monte Carlo methods are usually required to compute the BF, especially for highly parameterized models. A large-sample approximation of the BF yields the easily computable Bayesian information criterion (BIC)

BIC = −2 log(Lg) + q log n. (2.4)

In sum, both AIC and BIC, as well as their generalizations, have the similar form

LC = −2 log(Lg) + λ q, (2.5)

where λ is a fixed constant. From (2.2), (2.3), and (2.4), we can see that the BIC statistic imposes a much heavier penalty when n is large than AIC and AICC do, to overcome over-fitting. Also, from (2.5), it is easy to see that LC includes AIC, AICC and BIC as special cases.

The recent developments suggest the use of a data-adaptive penalty to replace the fixed penalty λ; see Bai, Rao and Wu (1999) and Shen and Ye (2002). That is, λ is estimated from the data in a complexity form based on a concept of generalized degrees of freedom.
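
In R, the criteria (2.2) and (2.4) are available through the AIC() function; setting k = log(n) gives BIC, matching the choice λ = log n in (2.5). A minimal sketch with made-up nested models:

# Comparing nested models by AIC and BIC (simulated data)
set.seed(5)
n <- 100
x1 <- rnorm(n); x2 <- rnorm(n)
y <- 1 + 2*x1 + rnorm(n)                      # x2 is irrelevant
fit.small <- lm(y ~ x1)
fit.big   <- lm(y ~ x1 + x2)
AIC(fit.small, fit.big)                       # AIC = -2 log L + 2q, as in (2.2)
AIC(fit.small, fit.big, k = log(n))           # k = log(n) gives BIC, as in (2.4)
q <- length(coef(fit.small)) + 1              # parameters, counting sigma^2
AIC(fit.small) + 2*q*(q + 1)/(n - q - 1)      # AICC by hand, as in (2.3)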

2.5 Cross-Validation and Generalized Cross-Validation

Cross-validation (CV) is the most commonly used method for model assessment and selection. The main idea is a direct estimate of the extra-sample error. The general version of CV splits the data into K roughly equal-sized parts, fits the model to the other K − 1 parts, and calculates the prediction error on the remaining part:

CV = ∑_{i=1}^n (Yi − Ŷ_{−i})², (2.6)

where Ŷ_{−i} is the fitted value computed with the part of the data containing the ith observation removed, and Yi − Ŷ_{−i} is the jackknife residual.

A convenient approximation to CV for linear fitting with squared error loss is generalized cross-validation (GCV). A linear fitting method has the following property: Ŷ = S Y, where Ŷi is the fitted value computed from the whole data set and S = (sij)_{n×n} is the smoothing (hat) matrix. For many linear fitting methods with leave-one-out CV (deleting one observation at a time), it can be shown easily that

CV = ∑_{i=1}^n (Yi − Ŷ_{−i})² = ∑_{i=1}^n [(Yi − Ŷi)/(1 − sii)]².

Due to the intensive computation, the CV can be approximated by the GCV, defined by

GCV = ∑_{i=1}^n [(Yi − Ŷi)/(1 − trace(S)/n)]² = ∑_{i=1}^n (Yi − Ŷi)² / (1 − trace(S)/n)². (2.7)

It has been shown that both the CV and GCV methods are very appealing for nonparametric modeling; see the book by Hastie and Tibshirani (1990). It follows from (2.7) that

GCV ≈ ∑_{i=1}^n (Yi − Ŷi)² (1 + trace(S)/n)² ≈ σ̂² [SSE/σ̂² + 2 trace(S)], (2.8)

where σ̂² = ∑_{i=1}^n (Yi − Ŷi)²/n. Therefore, under the normality assumption, GCV is asymptotically equivalent to AIC since trace(S) = q.

Recently, the leave-one-out cross-validation method was challenged by Shao (1993). Shao (1993) claimed that the popular leave-one-out cross-validation method, which is asymptotically equivalent to many other model selection methods such as the AIC, the Cp, and the bootstrap, is asymptotically inconsistent, in the sense that the probability of selecting the model with the best predictive ability does not converge to 1 as the total number of observations n → ∞. He showed that the inconsistency of leave-one-out cross-validation can be rectified by using leave-nν-out cross-validation, with nν, the number of observations reserved for validation, satisfying nν/n → 1 as n → ∞.
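
A minimal sketch of K-fold cross-validation for a linear model, written from scratch with made-up data (K = 5 here; leave-one-out corresponds to K = n):

# K-fold cross-validation for a linear model
set.seed(6)
n <- 100
x <- rnorm(n); y <- 1 + 2*x + rnorm(n)        # illustrative data
K <- 5
fold <- sample(rep(1:K, length.out = n))      # random fold assignment
cv <- 0
for(k in 1:K){
  train <- fold != k
  fit <- lm(y ~ x, subset = train)            # fit on the other K-1 parts
  pred <- predict(fit, newdata = data.frame(x = x[!train]))
  cv <- cv + sum((y[!train] - pred)^2)        # prediction error on held-out part
}
cv                                            # K-fold analogue of (2.6)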

2.6 Penalized Methods

1. Bridge and Ridge:

Frank and Friedman (1993) proposed the Lq (q > 0) penalized least squares criterion

∑_{i=1}^n (Yi − ∑_j βj Xij)² + λ ∑_j |βj|^q,

whose minimizer is called the bridge estimator. If q = 2, the resulting estimator is called the ridge estimator, given by β̂ = (XᵀX + λ I)⁻¹XᵀY.

2. LASSO:

Tibshirani (1996) proposed the least absolute shrinkage and selection operator (LASSO), which is the minimizer of the constrained least squares criterion

∑_{i=1}^n (Yi − ∑_j βj Xij)² + λ ∑_j |βj|,

which results in the soft thresholding rule β̂j = sign(β̂j⁰)(|β̂j⁰| − λ)₊, where β̂j⁰ denotes the unpenalized (least squares) estimate.

3. Non-concave Penalized LS:

Fan and Li (2001) proposed the non-concave penalized least squares

n∑

i=1

(Yi −∑

j

βj Xij)2 +

j

pλ(|βj|),

where the hard threshing penalty function pλ(|θ|) = λ2 − (|θ| − λ)2|(|θ| < λ), which

results in the hard threshing rule βj = β0j I(|β0

j | > λ). Finally, Fan and Li (2001)

proposed the so-called the smoothly clipped absolute deviation (SCAD) model selection

criterion with the penalized function defined as

p′λ(θ) = λ

I(θ ≤ λ) − (a λ− θ)+

(a− 1)λI(θ > λ)

for some a > 2 and θ > 0,

which results in the estimator

βj =

sign(β0j )(|β0

j | − λ)+ when |β0j | ≤ 2λ,

(a− 1)β0j − sign(β0

j ) a λ/(a− 2) when 2λ ≤ |β0

j | ≤ a λ,

β0j when |β0

j | > aλ.

Also, Fan and Li (2001) showed that the SCAD estimator satisfies three properties: (1) unbiasedness for large true coefficients, to avoid unnecessary estimation bias; (2) sparsity, estimating small coefficients as exactly 0 to reduce model complexity; and (3) continuity of the resulting estimator, to avoid unnecessary variation in model prediction. Fan and Peng (2004) considered the case where the number of regressors can depend on the sample size and go to infinity at a certain rate. To choose the appropriate tuning parameters a and λ in the SCAD penalty function, the CV, GCV, or BIC method can be used; see Fan and Li (2001), Fan and Peng (2004), and Wang, Li and Tsai (2007).
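The SCAD rule itself is simple to code componentwise in R (a sketch; beta0 is the unpenalized estimate, and a = 3.7 is the value suggested by Fan and Li (2001)):

scad <- function(beta0, lambda, a = 3.7) {
  b <- abs(beta0)
  ifelse(b <= 2 * lambda,
         sign(beta0) * pmax(b - lambda, 0),                             # soft rule near zero
         ifelse(b <= a * lambda,
                ((a - 1) * beta0 - sign(beta0) * a * lambda)/(a - 2),   # middle zone
                beta0))                                                 # large coefficients kept
}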

This approach extends to a more general setting, called penalized likelihood, which can be applied to other purposes, say to select copulas in modeling the dependence structure


of multivariate financial variables; see Cai, Chen, Fan and Wang (2007, "Selection of Copulas with Applications in Finance," working paper) and Cai and Wang (2007, "Selection of Mixture Distributions for Time Series," working paper). Also, it can be applied to quantile regression settings.

2.7 Implementation in R

2.7.1 Classical Models

To fit a multiple regression in R, one can use lm() or glm(); see the following for details:

lm(formula, data, subset, weights, na.action,

method = "qr", model = TRUE, x = FALSE, y = FALSE, qr = TRUE,

singular.ok = TRUE, contrasts = NULL, offset, ...)

glm(formula, family = gaussian, data, weights, subset,

na.action, start = NULL, etastart, mustart,

offset, control = glm.control(...), model = TRUE,

method = "glm.fit", x = FALSE, y = TRUE, contrasts = NULL, ...)

To fit a regression model without an intercept, you need to use

fit1=lm(y~-1+x1+...+x9)

where fit1 is the fitted model object containing all the outputs you need. For model diagnostic checking, you can use

plot(fit1)

For multivariate data, it is usually a good idea to view the data as a whole using the pairwise

scatter plots generated by the pairs() function:

pairs(data)

To drop or add one variable from or to a regression model, you use the command drop1()

or add1(), for example,

drop1(fit1)

add1(fit1,~x10+x11+...+x20)



The last command means that you choose the “best” one from X10 to X20 to add it into

the model. Adding and dropping terms using add1() and drop1() is a useful method for selecting a model when only a few terms are involved, but it can quickly become tedious.

Functions add1() and drop1() are based on the Cp criterion. The step() function provides

an automatic procedure for conducting stepwise model selection. The step() function re-

quires an initial model, often constructed explicitly as an intercept-only model. For example,

suppose that we want to find the “best” model involving X1, · · · , X10, we could create an

intercept-only model and then call step() as follows:

fit0=lm(y~1)

fit2=step(fit0,~x1+x2+...+x10, trace=F)

step(object, scope, scale = 0,

direction = c("both", "backward", "forward"),

trace = 1, keep = NULL, steps = 1000, k = 2, ...)

With “trace=T” or “trace=1”, step() displays the output of each step of the selection process. The step() function is based on the AIC (the default, k = 2) or the BIC (k = log(n)), chosen by specifying k in the call. Also,

one can use the function stepAIC() in the package MASS for a wider range of object

classes.
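As a self-contained illustration with simulated data (a sketch; setting k = log(n) switches the selection criterion from AIC to BIC):

set.seed(1)
n  <- 100
x1 <- rnorm(n); x2 <- rnorm(n); x3 <- rnorm(n)
y  <- 1 + 2*x1 - x2 + rnorm(n)                # x3 is irrelevant
dat  <- data.frame(y, x1, x2, x3)
fit0 <- lm(y ~ 1, data = dat)                 # intercept-only starting model
fitB <- step(fit0, scope = ~ x1 + x2 + x3, k = log(n), trace = FALSE)
summary(fitB)                                 # typically retains x1 and x2 only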

2.7.2 LASSO Type Methods

The package lasso2 provides many features for solving regression problems while imposing L1 constraints on the estimates, and the package lars provides efficient procedures for computing an entire LASSO path at the cost of a single least squares fit. LAR is least angle regression, proposed

by Efron, Hastie, Johnstone and Tibshirani (2004). In lars, you can use the function lars()

lars(x, y, type = c("lasso", "lar", "forward.stagewise"),trace = FALSE,

Gram, eps = .Machine$double.eps, max.steps, use.Gram = TRUE)

and the function cv.lars() to compute the K-fold cross-validated mean squared prediction

error for least angle regressions (lars), LASSO, or forward stagewise.

cv.lars(x, y, K = 10, fraction = seq(from = 0, to = 1, length = 100),

trace = FALSE, plot.it = TRUE, se = TRUE, ...)
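A typical workflow (a sketch, assuming x is a numeric predictor matrix and y a response vector; in some versions of lars the grid component of the cv.lars() output is named index rather than fraction, so check names() on your installation):

library(lars)
obj <- lars(x, y, type = "lasso")              # the full LASSO path
cvr <- cv.lars(x, y, K = 10, plot.it = FALSE)  # 10-fold CV along the path
s.opt <- cvr$fraction[which.min(cvr$cv)]       # fraction minimizing CV error
beta  <- coef(obj, s = s.opt, mode = "fraction")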


In the package lasso2, the function l1ce() is for regression fitting with an L1 constraint on the parameters.

l1ce(formula, data = sys.parent(), weights, subset, na.action,

sweep.out = ~ 1, x = FALSE, y = FALSE,

contrasts = NULL, standardize = TRUE,

trace = FALSE, guess.constrained.coefficients = double(p),

bound = 0.5, absolute.t = FALSE)

or the function gl1ce() for fitting a generalized regression problem while imposing an L1

constraint on the parameters

gl1ce(formula, data = sys.parent(), weights, subset, na.action,

family = gaussian, control = glm.control(...), sweep.out = ~ 1,

x = FALSE, y = TRUE, contrasts = NULL, standardize = TRUE,

guess.constrained.coefficients = double(p), bound = 0.5, ...)

Also, you can use the function gcv() to extract the generalized cross-validation score(s)

from fitted model objects.

gcv(object, ...)
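For example (a sketch; the Prostate data set shipped with lasso2 is used here, with lpsa as the response):

library(lasso2)
data(Prostate)
fit.l1 <- l1ce(lpsa ~ ., data = Prostate, bound = 0.5)  # L1 bound at 50%
coef(fit.l1)    # constrained coefficient estimates
gcv(fit.l1)     # GCV score for this choice of bound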

2.7.3 Example

We begin by introducing several environmental and economic as well as financial time series

to serve as illustrative data for time series methodology. Figure 2.1 shows monthly values of

an environmental series called the Southern Oscillation Index (SOI) and associated recruit-

ment (number of new fish) computed from a model by Pierre Kleiber, Southwest Fisheries

Center, La Jolla, California. This data set is provided by Shumway (2006). Both series are

for a period of 453 months ranging over the years 1950−1987. The SOI measures changes in

air pressure that are related to sea surface temperatures in the central Pacific. The central

Pacific Ocean warms up every three to seven years due to the El Nino effect, which has been blamed, in particular, for floods in the midwestern portions of the U.S.


Both series in Figure 2.1 tend to exhibit repetitive behavior, with regularly repeating

(stochastic) cycles that are easily visible. This periodic behavior is of interest because un-

derlying processes of interest may be regular and the rate or frequency of oscillation char-

acterizing the behavior of the underlying series would help to identify them. One can also

remark that the cycles of the SOI are repeating at a faster rate than those of the recruitment

series. The recruit series also shows several kinds of oscillations, a faster frequency that

seems to repeat about every 12 months and a slower frequency that seems to repeat about

every 50 months. The study of the kinds of cycles and their strengths will be discussed later.

The two series also tend to be somewhat related; it is easy to imagine that somehow the fish

population is dependent on the SOI. Perhaps there is even a lagged relation, with the SOI

signalling changes in the fish population.

The study of the variation in the different kinds of cyclical behavior in a time series can

be aided by computing the power spectrum which shows the variance as a function of the

frequency of oscillation. Comparing the power spectra of the two series would then give

valuable information relating to the relative cycles driving each one. One might also want

to know whether or not the cyclical variations of a particular frequency in one of the series,

say the SOI, are associated with the frequencies in the recruitment series. This would be

measured by computing the correlation as a function of frequency, called the coherence.

The study of systematic periodic variations in time series is called spectral analysis. See

Shumway (1988), Shumway (2006), and Shumway and Stoffer (2000) for details.

Figure 2.1: Monthly SOI (left) and simulated recruitment (right) from a model (n = 453 months, 1950-1987).


We will need a characterization for the kind of stability that is exhibited by the environ-

mental and fish series. One can note that the two series seem to oscillate fairly regularly around

central values (0 for SOI and 64 for recruitment). Also, the lengths of the cycles and their

orientations relative to each other do not seem to be changing drastically over the time

histories.

We consider the twelve-month moving average with weights $a_j = 1/12$ for $j = 0, \pm 1, \ldots, \pm 6$ and zero otherwise. The result of applying this filter to the SOI index is shown in Figure 2.2. It is clear that this filter removes some higher oscillations and produces a smoother series.

Figure 2.2: The SOI series (black solid line) compared with a 12-point moving average (red thicker solid line). The left panel: original data; the right panel: filtered series.

In fact, the yearly oscillations have been filtered out (see the right panel in Figure

2.2) and a lower frequency oscillation appears with a cycling rate of about 42 months. This

is the so-called El Nino effect that accounts for all kinds of phenomena. This filtering effect

will be examined further later, when we discuss spectral analysis, since it is extremely important to know

exactly how one is influencing the periodic oscillations by filtering.
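In R, such a symmetric moving average can be applied with the filter() function (a sketch; y holds the SOI series, and the weights follow the definition above):

wgts  <- rep(1/12, 13)                          # a_j = 1/12 for j = -6, ..., 6
soi.f <- stats::filter(y, filter = wgts, sides = 2)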

In Figure 2.3, we have made a lagged scatterplot of the SOI series at time $t + h$ against the SOI series at time $t$, and obtained a high correlation, 0.412, between the series $x_{t+12}$ and the series $x_t$, i.e., at a shift of 12 months. Lower order lags at $t-1$, $t-2$ also show correlation.

The scatterplot shows the direction of the relation which tends to be positive for lags 1,

2, 11, 12, 13, and tends to be negative for lags 6, 7, 8. The scatterplots also suggest that no significant nonlinearities are present.

Figure 2.3: Multiple lagged scatterplots showing the relationship between the present SOI values ($x_t$) and the lagged values ($x_{t+h}$) at lags $1 \le h \le 16$.

In order to develop a measure for this self correlation or autocorrelation, we utilize a sample version of the scaled autocovariance function, say
$$\hat{\rho}_x(h) = \hat{\gamma}_x(h)/\hat{\gamma}_x(0), \quad \text{where} \quad \hat{\gamma}_x(h) = \frac{1}{n} \sum_{t=1}^{n-h} (x_{t+h} - \bar{x})(x_t - \bar{x})$$
is the sample counterpart of the autocovariance function, with $\bar{x} = \sum_{t=1}^n x_t/n$. Under the assumption that the underlying process $x_t$ is white noise, the approximate standard error of the sample ACF is
$$\sigma_{\hat{\rho}} = \frac{1}{\sqrt{n}}. \tag{2.9}$$

That is, ρx(h) is approximately normal with mean 0 and variance 1/n.
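In R, the sample ACF and the $\pm 1.96/\sqrt{n}$ white-noise bands of (2.9) are obtained directly (a sketch; y holds the SOI series):

n   <- length(y)
r   <- acf(y, lag.max = 50, plot = FALSE)$acf
big <- which(abs(r) > 1.96/sqrt(n)) - 1   # lags exceeding the bands (lag 0 is first)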

As an illustration, consider the autocorrelation functions computed for the environmental

and recruitment series shown in the top two panels of Figure 2.4. Both of the autocorrelation

functions show some evidence of periodic repetition. The ACF of SOI seems to repeat at

periods of 12 while the recruitment has a dominant period that repeats at about 12 to 16

time points. Again, the maximum values are well above two standard errors shown as dotted

lines above and below the horizontal axis.

Figure 2.4: Autocorrelation functions of SOI and recruitment and cross correlation function between SOI and recruitment.

In order to examine this possibility, consider the lagged scatterplot matrix shown in

Figures 2.5 and 2.6. Figure 2.5 plots the SOI at time t + h, xt+h, versus the recruitment

series y_t at lags 0 ≤ h ≤ 15.

Figure 2.5: Multiple lagged scatterplots showing the relationship between the SOI at time t + h, say x_{t+h} (x-axis), versus recruits at time t, say y_t (y-axis), 0 ≤ h ≤ 15.

There are no particularly strong linear relations apparent in these plots, i.e., future values of the SOI are not related to current recruitment. This

means that the temperatures are not responding to past recruitment. In Figure 2.6, the

current SOI values, xt are plotted against the future recruitment values, yt+h for 0 ≤ h ≤ 15.

It is clear from Figure 2.6 that the series are negatively correlated for lags h = 5, ..., 9.

Figure 2.6: Multiple lagged scatterplots showing the relationship between the SOI at time t, say x_t (x-axis), versus recruits at time t + h, say y_{t+h} (y-axis), 0 ≤ h ≤ 15.

The correlation at lag 6, for example, is −0.60, implying that increases in the SOI lead decreases

in number of recruits by about 6 months. On the other hand, the series are hardly correlated

(0.025) at all in the conventional sense, measured at lag h = 0. The general pattern suggests

that predicting recruits might be possible using the El Nino effect at lags of 5, 6, 7, ... months.

We show in Figure 2.7 the partial autocorrelation functions of the SOI series (left panel) and the recruits series (right panel). Note that the PACF of the

SOI has a single peak at lag h = 1 and then relatively small values. This means, in effect,

that fairly good prediction can be achieved by using the immediately preceding point and

that adding further values does not really improve the situation. Hence we might try an

autoregressive model with p = 1. The recruits series has two peaks and then small values,

implying that the pure correlation between points is summarized by the first two lags.

Figure 2.7: Partial autocorrelation functions for the SOI (left panel) and the recruits (right panel) series.

Table 2.1: AICC values for ten models for the recruits series

p      1     2     3     4     5     6     7     8     9     10
AICC   5.75  5.52  5.53  5.54  5.54  5.55  5.55  5.56  5.57  5.58

We consider the simple problem of modeling the recruit series shown in the right panel

of Figure 2.1 using an autoregressive model. The top right panel of Figure 2.4 and the right panel of Figure 2.7 show the autocorrelation and partial autocorrelation functions of the recruit series. The PACF has large values for h = 1 and 2 and then is essentially zero for higher order lags. By the defining property of an autoregressive model, this implies that a second order (p = 2) AR model might provide a good fit. Running the regression program for an

AR(2) model with intercept

$$x_t = \phi_0 + \phi_1 x_{t-1} + \phi_2 x_{t-2} + w_t$$
leads to the estimates $\hat{\phi}_0 = 61.8439\,(4.0121)$, $\hat{\phi}_1 = 1.3512\,(0.0417)$, $\hat{\phi}_2 = -0.4612\,(0.0416)$, and $\hat{\sigma}^2 = 89.53$, where the estimated standard deviations are in parentheses. To determine

whether the above order is the best choice, we fitted models for 1 ≤ p ≤ 10, obtaining

corrected AIC (AICC) values summarized in Table 2.1 using (2.3). This shows that the

minimum AICC is obtained for p = 2 and we choose the second order model.
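The AR(2) fit above can be reproduced with arima() (a sketch; rec holds the recruits series, and arima() reports the series mean as "intercept", from which $\phi_0$ is recovered):

fit2 <- arima(rec, order = c(2, 0, 0))    # AR(2) with a mean term
phi  <- fit2$coef                         # ar1, ar2, intercept (= mean)
phi0 <- phi["intercept"] * (1 - phi["ar1"] - phi["ar2"])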

The previous example fits various autoregressive models to the recruits series; for example, one can fit a second-order regression model. We may also use this regression idea to fit


the model to other series such as a detrended version of the SOI given in previous discussions.

We have noted in our discussions of Figure 2.7 from the partial autocorrelation function that

a plausible model for this series might be a first order autoregression of the form given above

with p = 1. Again, putting the model above into the regression framework (1.2) for a single coefficient leads to the estimates $\hat{\phi}_1 = 0.59$ with standard error 0.04, $\hat{\sigma}^2 = 0.09218$, and AICC(1) = −1.375. The ACF of these residuals, shown in the left panel of Figure 2.8, however, still shows cyclical variation, and it is clear that the residuals still have a number of values exceeding the $1.96/\sqrt{n}$ threshold.

Figure 2.8: ACF of residuals of the AR(1) fit for SOI (left panel) and the plot of AIC and AICC values (right panel).

A suggested procedure is to try higher order autoregressive models; successive models for 1 ≤ p ≤ 30 were fitted, and the AIC and AICC values are

plotted in the right panel of Figure 2.8. There is a clear minimum for a p = 16 order model.

The estimated coefficient vector $\hat{\phi}$ has components (standard errors in parentheses) 0.4050 (0.0469), 0.0740 (0.0505), 0.1527 (0.0499), 0.0915 (0.0505), −0.0377 (0.0500), −0.0803 (0.0493), −0.0743 (0.0493), −0.0679 (0.0492), 0.0096 (0.0492), 0.1108 (0.0491), 0.1707 (0.0492), 0.1606 (0.0499), 0.0281 (0.0504), −0.1902 (0.0501), −0.1283 (0.0510), −0.0413 (0.0476), and $\hat{\sigma}^2 = 0.07166$.

Exercise: Please use a LASSO-type approach (you can choose any one) to re-run this example to see what you obtain. For the R command, you can use the package lasso2 or lars.
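As a hedged starting point for this exercise (a sketch only, not a full solution; the lag matrix mirrors the y.soi construction in Section 2.8, and x.soi is the SOI series read in there):

# LASSO/LAR fit of the SOI on its first 30 lags (illustrative sketch)
library(lars)
p=30
X=embed(x.soi,p+1)                         # column 1 = x_t, columns 2..p+1 = lags 1..p
fit=lars(X[,-1],X[,1],type="lasso")
cvfit=cv.lars(X[,-1],X[,1],K=10,plot.it=F) # 10-fold CV to pick the shrinkage level
best=cvfit$index[which.min(cvfit$cv)]
coef(fit,s=best,mode="fraction")           # selected lag coefficients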

2.8 Computer Codes

#####################################################################
# This is Example 2.1 for Southern Oscillation Index and Recruits data
#####################################################################
y<-read.table("c:/res-teach/xiamen12-06/data/ex2-1a.txt",header=F)  # read data file
x<-read.table("c:/res-teach/xiamen12-06/data/ex2-1b.txt",header=F)
y=y[,1]
x=x[,1]
postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.1.eps",
  horizontal=F,width=6,height=6)            # save the graph as a postscript file
#win.graph()
par(mfrow=c(1,2),mex=0.4,bg="yellow")
ts.plot(y,type="l",lty=1,ylab="",xlab="")   # make a time series plot
title(main="Southern Oscillation Index",cex=0.5)  # set up the title of the plot
abline(0,0)                                 # add a horizontal reference line
#win.graph()
ts.plot(x,type="l",lty=1,ylab="",xlab="")
abline(mean(x),0)
title(main="Recruit",cex=0.5)
dev.off()

n=length(y)
n2=n-12
yma=rep(0,n2)
for(i in 1:n2) yma[i]=mean(y[i:(i+12)])     # compute a 13-point moving average
yy=y[7:(n2+6)]
yy0=yy-yma
postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.2.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4)
ts.plot(yy,type="l",lty=1,ylab="",xlab="")
points(1:n2,yma,type="l",lty=1,lwd=3,col=2)
ts.plot(yy0,type="l",lty=1,ylab="",xlab="")
points(1:n2,yma,type="l",lty=1,lwd=3,col=2) # add the moving average as a line
abline(0,0)
dev.off()

m=17
n1=n-m
y.soi=rep(0,n1*m)
dim(y.soi)=c(n1,m)
y.rec=y.soi
for(i in 1:m){
  y.soi[,i]=y[i:(n1+i-1)]
  y.rec[,i]=x[i:(n1+i-1)]
}
text_soi=c("1","2","3","4","5","6","7","8","9","10","11","12","13",
  "14","15","16")
postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.3.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(4,4),mex=0.4,bg="light blue")
for(i in 2:17){
  plot(y.soi[,1],y.soi[,i],type="p",pch="o",ylab="",xlab="",
    ylim=c(-1,1),xlim=c(-1,1))
  text(0.8,-0.8,text_soi[i-1],cex=2)
}
dev.off()

text1=c("ACF of SOI Index")
text2=c("ACF of Recruits")
text3=c("CCF of SOI and Recruits")
SOI=y
Recruits=x
postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.4.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light pink")
acf(y,ylab="",xlab="",ylim=c(-0.5,1),lag.max=50,main="")  # make an ACF plot
legend(10,0.8,text1)                                      # set up the legend
acf(x,ylab="",xlab="",ylim=c(-0.5,1),lag.max=50,main="")
legend(10,0.8,text2)
ccf(y,x,ylab="",xlab="",ylim=c(-0.5,1),lag.max=50,main="")
legend(-40,0.8,text3)
dev.off()

postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.5.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(4,4),mex=0.4,bg="light green")
for(i in 1:16){
  plot(y.soi[,i],y.rec[,1],type="p",pch="o",ylab="",xlab="",
    ylim=c(0,100),xlim=c(-1,1))
  text(-0.8,10,text_soi[i],cex=2)
}
dev.off()

postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.6.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(4,4),mex=0.4,bg="light grey")
for(i in 1:16){
  plot(y.soi[,1],y.rec[,i],type="p",pch="o",ylab="",xlab="",
    ylim=c(0,100),xlim=c(-1,1))
  text(-0.8,10,text_soi[i],cex=2)
}
dev.off()

postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.7.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4,bg="light blue")
pacf(y,ylab="",xlab="",lag=30,ylim=c(-0.5,1),main="")
text(10,0.9,"PACF of SOI")
pacf(x,ylab="",xlab="",lag=30,ylim=c(-0.5,1),main="")
text(10,0.9,"PACF of Recruits")
dev.off()

################################################################
x<-read.table("c:/res-teach/xiamen12-06/data/ex2-1a.txt",header=T)
x.soi=x[,1]
n=length(x.soi)
aicc=0
if(aicc==1){
  aic.value=rep(0,30)                      # max.lag=30
  aicc.value=aic.value
  sigma.value=rep(0,30)
  for(i in 1:30){
    fit3=arima(x.soi,order=c(i,0,0))       # fit an AR(i) model
    aic.value[i]=fit3$aic/n-2              # compute AIC
    sigma.value[i]=fit3$sigma2             # obtain the estimated sigma^2
    aicc.value[i]=log(sigma.value[i])+(n+i)/(n-i-2)  # compute AICC
    print(c(i,aic.value[i],aicc.value[i]))
  }
  data=cbind(aic.value,aicc.value)
  write(t(data),"c:/res-teach/xiamen12-06/soi_aic.dat",ncol=2)
}else{
  data<-matrix(scan("c:/res-teach/xiamen12-06/soi_aic.dat"),byrow=T,ncol=2)
}
text4=c("AIC","AICC")
fit11=arima(x.soi,order=c(1,0,0))
resid1=fit11$residual
postscript(file="c:/res-teach/xiamen12-06/figs/fig-2.8.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4,bg="light yellow")
acf(resid1,ylab="",xlab="",lag.max=20,ylim=c(-0.5,1),main="")
text(10,0.8,"ACF of residuals of AR(1) for SOI")
matplot(1:30,data,type="b",pch="o",col=c(1,2),ylab="",xlab="Lag",cex=0.6)
legend(16,-1.40,text4,lty=1,col=c(1,2))
dev.off()
#fit2=arima(x.soi,order=c(16,0,0))
#print(fit2)

2.9 References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Proceedings of the 2nd International Symposium on Information Theory (V. Petrov and F. Csaki, eds.), 267-281. Akademiai Kiado, Budapest.

Bai, Z., C.R. Rao and Y. Wu (1999). Model selection with data-oriented penalty. Journal of Statistical Planning and Inference, 77, 103-117.

Burnham, K.P. and D. Anderson (2003). Model Selection and Multi-Model Inference: A Practical Information Theoretic Approach, 2nd edition. New York: Springer-Verlag.

Efron, B., T. Hastie, I. Johnstone and R. Tibshirani (2004). Least angle regression. The Annals of Statistics, 32, 407-451.

Eicker, F. (1967). Limit theorems for regression with unequal and dependent errors. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability (L. LeCam and J. Neyman, eds.), University of California Press, Berkeley.

Fan, J. and R. Li (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360.

Fan, J. and H. Peng (2004). Nonconcave penalized likelihood with a diverging number of parameters. The Annals of Statistics, 32, 928-961.

Frank, I.E. and J.H. Friedman (1993). A statistical view of some chemometric regression tools (with discussion). Technometrics, 35, 109-148.

Hastie, T.J. and R.J. Tibshirani (1990). Generalized Additive Models. London: Chapman and Hall.

Hurvich, C.M. and C.-L. Tsai (1989). Regression and time series model selection in small samples. Biometrika, 76, 297-307.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6, 461-464.

Shao, J. (1993). Linear model selection by cross-validation. Journal of the American Statistical Association, 88, 486-494.

Shen, X.T. and J.M. Ye (2002). Adaptive model selection. Journal of the American Statistical Association, 97, 210-221.

Shibata, R. (1976). Selection of the order of an autoregressive model by Akaike's information criterion. Biometrika, 63, 117-126.

Shumway, R.H. (1988). Applied Statistical Time Series Analysis. Englewood Cliffs, NJ: Prentice-Hall.

Shumway, R.H. (2006). Lecture Notes on Applied Time Series Analysis. Department of Statistics, University of California at Davis.

Shumway, R.H. and D.S. Stoffer (2000). Time Series Analysis & Its Applications. New York: Springer-Verlag.

Tibshirani, R.J. (1996). Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B, 58, 267-288.

Wang, H., R. Li and C.-L. Tsai (2007). Tuning parameter selectors for the smoothly clipped absolute deviation method. Biometrika, 94, 553-568.

Chapter 3

Regression Models With Correlated Errors

3.1 Methodology

In many applications, the relationship between two time series is of major interest. The market model in finance is an example that relates the return of an individual stock to the return of a market index. The term structure of interest rates is another example, in which the time evolution of the relationship between interest rates with different maturities is investigated. These examples lead to the consideration of a linear regression of the form

y_t = β1 + β2 x_t + e_t,

where y_t and x_t are two time series and e_t denotes the error term. The least squares (LS) method is often used to estimate the above model. If e_t is a white noise series, then the LS method produces consistent estimates. In practice, however, it is common to see that the error term e_t is serially correlated. In this case, we have a regression model with time series errors, and the LS estimates of β1 and β2 may not be consistent and efficient.

The regression model with time series errors is widely applicable in economics and finance, but it is one of the most commonly misused econometric models because the serial dependence in e_t is often overlooked. It pays to study the model carefully. The standard method for dealing with correlated errors e_t in the regression model

y_t = β^T z_t + e_t

is to try to transform the errors e_t into uncorrelated ones and then apply the standard least squares approach to the transformed observations. For example, let P be an n × n matrix

that transforms the vector e = (e_1, · · · , e_n)^T into a set of independent, identically distributed variables with variance σ². Then, transform the matrix version (1.4) to

P y = P Z β + P e

and proceed as before. Of course, the major problem is deciding what to choose for P, but in the time series case, happily, there is a reasonable solution, based again on time series ARMA models. Suppose that we can find a reasonable ARMA model for the residuals, say, for example, the ARIMA(p, 0, 0) model

e_t = ∑_{k=1}^p φ_k e_{t−k} + w_t,

which defines a linear transformation of the correlated e_t to a sequence of uncorrelated w_t. We can ignore the problems near the beginning of the series by starting at t = p. In the ARMA notation, using the back-shift operator L, we may write

φ(L) e_t = w_t,    (3.1)

where

φ(L) = 1 − ∑_{k=1}^p φ_k L^k,    (3.2)

and applying the operator to both sides of (1.2) leads to the model

φ(L) y_t = β^T φ(L) z_t + w_t,    (3.3)

where the w_t's now satisfy the independence assumption. Doing ordinary least squares on the transformed model is the same as doing weighted least squares on the untransformed model. The only problem is that we do not know the values of the coefficients φ_k (1 ≤ k ≤ p) in the transformation (3.2). However, if we knew the residuals e_t, it would be easy to estimate the coefficients, since (3.2) can be written in the form

e_t = φ^T e_{t−1} + w_t,    (3.4)

which is exactly the usual regression model (1.2) with φ = (φ_1, · · · , φ_p)^T replacing β and e_{t−1} = (e_{t−1}, e_{t−2}, · · · , e_{t−p})^T replacing z_t. The above comments suggest a general approach, known as the Cochrane-Orcutt (1949) procedure, for dealing with the problem of correlated errors in the time series context.

1. Begin by fitting the original regression model (1.2) by least squares, obtaining β and the residuals e_t = y_t − β^T z_t.

2. Fit an ARMA model to the estimated residuals, say φ(L) e_t = θ(L) w_t.

3. Apply the ARMA transformation found to both sides of the regression equation (1.2) to obtain

[φ(L)/θ(L)] y_t = β^T [φ(L)/θ(L)] z_t + w_t.

4. Run an ordinary least squares regression on the transformed values to obtain the new β.

5. Return to step 2 if desired.

Often, one iteration is enough to develop the estimators under a reasonable correlation structure. In general, the Cochrane-Orcutt procedure converges to the maximum likelihood or weighted least squares estimators.
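To make steps 1-4 concrete, here is a minimal sketch in R of one Cochrane-Orcutt iteration under AR(1) errors (the data are simulated and the variable names are hypothetical):

# One Cochrane-Orcutt iteration assuming AR(1) errors (illustrative sketch)
set.seed(1)
n=200
x=rnorm(n)
e=arima.sim(list(ar=0.6),n=n)     # serially correlated errors
y=1+2*x+e
fit0=lm(y~x)                      # step 1: OLS fit
rho=arima(resid(fit0),order=c(1,0,0),include.mean=F)$coef[1]  # step 2: AR(1) for residuals
ys=y[-1]-rho*y[-n]                # step 3: quasi-difference both sides
xs=x[-1]-rho*x[-n]
fit1=lm(ys~xs)                    # step 4: OLS on the transformed data
print(summary(fit1))              # slope estimates beta_2; intercept estimates beta_1*(1-rho)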

Note that there is a function in R to compute the Cochrane-Orcutt estimator,

arima(x, order = c(0, 0, 0),
      seasonal = list(order = c(0, 0, 0), period = NA),
      xreg = NULL, include.mean = TRUE, transform.pars = TRUE,
      fixed = NULL, init = NULL, method = c("CSS-ML", "ML", "CSS"),
      n.cond, optim.control = list(), kappa = 1e6)

by specifying "xreg=...", where xreg is a vector or matrix of external regressors, which must have the same number of rows as "x".
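For instance, with y and x as in the sketch above, a regression of y on x with AR(1) errors can be estimated jointly by maximum likelihood (again only an illustration):

# Regression with AR(1) errors, estimated jointly (illustrative)
fit=arima(y,order=c(1,0,0),xreg=x)
print(fit)   # reports the coefficient on x together with the AR(1) parameter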

Example 3.1: The data shown in Figure 3.1 represent quarterly earnings per share for the American company Johnson & Johnson from the fourth quarter of 1970 to the first quarter of 1980. We might consider an alternative approach to treating the Johnson & Johnson earnings series, assuming that y_t = log(x_t) = β1 + β2 t + e_t. In order to analyze the data with this approach, first we fit the model above, obtaining β1 = −0.6678(0.0349) and β2 = 0.0417(0.0071). The residuals e_t = y_t − β1 − β2 t can be computed easily; their ACF and PACF are shown in the top two panels of Figure 3.2. Note that the ACF and PACF suggest that a seasonal AR series will fit well, and we show the ACF and PACF of these residuals in the bottom panels of Figure 3.2.

Figure 3.1: Quarterly earnings for Johnson & Johnson (4th quarter 1970 to 1st quarter 1980, left panel) with log transformed earnings (right panel).

Figure 3.2: Autocorrelation functions (ACF) and partial autocorrelation functions (PACF) for the detrended log J&J earnings series (top two panels) and the fitted ARIMA(0,0,0)×(1,0,0)_4 residuals (bottom two panels).

The seasonal AR model is of the form e_t = Φ1 e_{t−4} + w_t, and we obtain Φ1 = 0.7614(0.0639), with σ²_w = 0.00779. Using these values, we transform y_t to

y_t − Φ1 y_{t−4} = β1(1 − Φ1) + β2[t − Φ1(t − 4)] + w_t

using the estimated value Φ1 = 0.7614. With this transformed regression, we obtain the new estimators β1 = −0.7488(0.1105) and β2 = 0.0424(0.0018). The new estimator has the advantage of being unbiased and having a smaller generalized variance.

To forecast, we consider the original model with the newly estimated β1 and β2. We obtain the approximate forecast y^t_{t+h} = β1 + β2(t + h) + e^t_{t+h} for the log transformed series, along with upper and lower limits depending on the estimated variance, which only incorporates the prediction variance of e^t_{t+h}, considering the trend and seasonal autoregressive parameters as fixed. The narrower upper and lower limits (the figure is not presented here) are mainly a reflection of a slightly better fit to the residuals and the ability of the trend model to take care of the nonstationarity.

Example 3.2: We consider the relationship between two U.S. weekly interest rate series: x_t, the 1-year Treasury constant maturity rate, and y_t, the 3-year Treasury constant maturity rate. Both series have 1967 observations from January 5, 1962 to September 10, 1999 and are measured in percentages. The series are obtained from the Federal Reserve Bank of St. Louis.

Figure 3.3 shows the time plots of the two interest rates, with the solid line denoting the 1-year rate and the dashed line the 3-year rate.

Figure 3.3: Time plots of U.S. weekly interest rates (in percentages) from January 5, 1962 to September 10, 1999. The solid line (black) is the Treasury 1-year constant maturity rate and the dashed line (red) is the Treasury 3-year constant maturity rate.

Figure 3.4: Scatterplots of U.S. weekly interest rates from January 5, 1962 to September 10, 1999: the left panel is the 3-year rate versus the 1-year rate, and the right panel is changes in the 3-year rate versus changes in the 1-year rate.

The left panel of Figure 3.4 plots y_t versus x_t, indicating that, as expected, the two interest rates are highly correlated. A naive way to describe the relationship between the two interest rates is to use the simple model, Model I: y_t = β1 + β2 x_t + e_t. This results in the fitted model y_t = 0.911 + 0.924 x_t + e_t, with σ²_e = 0.538 and R² = 95.8%, where the standard errors of the two coefficients are 0.032 and 0.004, respectively. This simple model (Model I) confirms the high correlation between the two interest rates. However, the model is seriously inadequate, as shown by Figure 3.5, which gives the time plot and ACF of its residuals.

Figure 3.5: Residual series of linear regression Model I for two U.S. weekly interest rates: the left panel is the time plot and the right panel is the ACF.

In particular, the sample ACF of the residuals is highly significant and decays slowly, showing the pattern of a unit root nonstationary time series.¹ The behavior of the residuals suggests that marked differences exist between the two interest rates. Using modern econometric terminology, if one assumes that the two interest rate series are unit root nonstationary, then the behavior of the residuals indicates that the two interest rates are not co-integrated; see later chapters for discussion of unit roots and co-integration. In other words, the data fail to support the hypothesis that there exists a long-term equilibrium between the two interest rates. In some sense, this is not surprising because the pattern of an "inverted yield curve" did occur during the data span. By the inverted yield curve, we mean the situation under which interest rates are inversely related to their times to maturity.

¹ We will discuss in detail how to do a unit root test later.

The unit root behavior of both interest rates and the residuals leads to the consideration of the change series of the interest rates. Let ∆x_t = x_t − x_{t−1} = (1 − L) x_t be changes in the 1-year interest rate and ∆y_t = y_t − y_{t−1} = (1 − L) y_t denote changes in the 3-year interest rate. Consider the linear regression, Model II: ∆y_t = β1 + β2 ∆x_t + e_t. Figure 3.6 shows time plots of the two change series, whereas the right panel of Figure 3.4 provides a scatterplot between them.

Figure 3.6: Time plots of the change series of U.S. weekly interest rates from January 12, 1962 to September 10, 1999: changes in the Treasury 1-year constant maturity rate are denoted by the black solid line, and changes in the Treasury 3-year constant maturity rate are indicated by the red dashed line.

The change series remain highly correlated, with a fitted linear regression model given by ∆y_t = 0.0002 + 0.7811 ∆x_t + e_t with σ²_e = 0.0682 and R² = 84.8%. The standard errors of the two coefficients are 0.0015 and 0.0075, respectively. This model further confirms the strong linear dependence between the interest rates. The two top panels of Figure 3.7 show the time plot (left) and sample ACF (right) of the residuals (Model II). Once again, the ACF shows some significant serial correlation in the residuals, but the magnitude of the correlation is much smaller. This weak serial dependence in the residuals can be modeled by

Figure 3.7: Residual series of the linear regression models: Model II (top) and Model III (bottom) for two change series of U.S. weekly interest rates: time plot (left) and ACF (right).

using the simple time series models discussed in the previous sections, and we have a linear regression with time series errors.

The main objective of this section is to discuss a simple approach for building a linear regression model with time series errors. The approach is straightforward: we employ a simple time series model discussed in this chapter for the residual series and estimate the whole model jointly. For illustration, consider the simple linear regression in Model II. Because the residuals of the model are serially correlated, we identify a simple ARMA model for them. From the sample ACF of the residuals shown in the right top panel of Figure 3.7, we specify an MA(1) model for the residuals and modify the linear regression model to Model III: ∆y_t = β1 + β2 ∆x_t + e_t with e_t = w_t − θ1 w_{t−1}, where w_t is assumed to be a white noise series. In other words, we simply use an MA(1) model, without the constant term, to capture the serial dependence in the error term of Model II. The two bottom panels of Figure 3.7 show the time plot (left) and sample ACF (right) of the residuals (Model III). The resulting model is a simple example of linear regression with time series errors. In practice, more elaborate time series models can be added to a linear regression equation to form a general regression model with time series errors.

Estimating a regression model with time series errors was not easy before the advent of modern computers. Special methods, such as the Cochrane-Orcutt estimator, have been proposed to handle the serial dependence in the residuals. By now, the estimation is as easy as that of other time series models. If the time series model used is stationary and invertible, then one can estimate the model jointly via the maximum likelihood method or the conditional maximum likelihood method. For the U.S. weekly interest rate data, the fitted version of Model III is ∆y_t = 0.0002 + 0.7824 ∆x_t + e_t with e_t = w_t + 0.2115 w_{t−1}, σ²_w = 0.0668 and R² = 85.4%. The standard errors of the parameters are 0.0018, 0.0077, and 0.0221, respectively. The model no longer has a significant lag-1 residual ACF, even though some minor residual serial correlations remain at lags 4 and 6. The incremental improvement from adding additional MA parameters at lags 4 and 6 to the residual equation is small, and the result is not reported here.

Comparing the above three models, we make the following observations. First, the high R² and the coefficient 0.924 of Model I are misleading because the residuals of the model show strong serial correlations. Second, for the change series, the R² and the coefficient of ∆x_t of Model II and Model III are close. In this particular instance, adding the MA(1) model to the change series only provides a marginal improvement. This is not surprising because the estimated MA coefficient is small numerically, even though it is statistically highly significant. Third, the analysis demonstrates that it is important to check residual serial dependence in linear regression analysis. Because the constant term of Model III is insignificant, the model shows that the two weekly interest rate series are related as y_t = y_{t−1} + 0.782 (x_t − x_{t−1}) + w_t + 0.212 w_{t−1}. The interest rates are concurrently and serially correlated.

Finally, we outline a general procedure for analyzing linear regression models with time series errors. First, fit the linear regression model and check the serial correlations of the residuals (see below). Second, if the residual series is unit-root nonstationary, take the first difference of both the dependent and explanatory variables and return to the first step; if the residual series appears to be stationary, identify an ARMA model for the residuals and modify the linear regression model accordingly. Third, perform a joint estimation via the maximum likelihood method and check the fitted model for further improvement. All of the above steps can be implemented in R with the command arima().

To check the serial correlations of the residuals, we recommend that the Ljung-Box statistic be used instead of the Durbin-Watson (DW) statistic, because the latter only considers the lag-1 serial correlation. There are cases in which residual serial dependence appears at higher-order lags. This is particularly so when the time series involved exhibits some seasonal behavior.

Remark: For a residual series e_t with T observations, the Durbin-Watson statistic is

DW = ∑_{t=2}^T (e_t − e_{t−1})² / ∑_{t=1}^T e_t².

Straightforward calculation shows that DW ≈ 2(1 − ρ_e(1)), where ρ_e(1) is the lag-1 ACF of e_t: expanding the square, the numerator is approximately 2 ∑_t e_t² − 2 ∑_t e_t e_{t−1}.

Consider testing that several autocorrelation coefficients are simultaneously zero, i.e., H0 : ρ_1 = ρ_2 = · · · = ρ_m = 0. Under the null hypothesis, it is easy to show (see Box and Pierce (1970)) that

Q = T ∑_{k=1}^m ρ̂_k² −→ χ²_m.    (3.5)

Ljung and Box (1978) provided the following finite sample correction, which yields a better fit to the χ²_m distribution for small sample sizes:

Q* = T (T + 2) ∑_{k=1}^m ρ̂_k²/(T − k) −→ χ²_m.    (3.6)

Both are called Q-tests and are well known in the statistics literature. Of course, they are very useful in applications.

The function in R for the Ljung-Box test is

Box.test(x, lag = 1, type = c("Box-Pierce", "Ljung-Box"))

and the Durbin-Watson test for autocorrelation of disturbances (in the package lmtest) is

dwtest(formula, order.by = NULL, alternative = c("greater", "two.sided", "less"),
       iterations = 15, exact = NULL, tol = 1e-10, data = list())

3.2 Nonparametric Models with Correlated Errors

See the paper by Xiao, Linton, Carroll and Mammen (2003).

3.3 Predictive Regression Models

See the papers by Stambaugh (1999), Amihud and Hurvich (2004), Torous, Valkanov and Yan (2004), and Campbell and Yogo (2006), and the references therein.

3.4 Computer Codes

##################################################
# This is Example 3.1 for Johnson and Johnson data
##################################################
y=read.table('c:/res-teach/xiamen12-06/data/ex3-1.dat',header=F)
n=length(y[,1])
y_log=log(y[,1])   # log of the data
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.1.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4,bg="light yellow")
ts.plot(y,type="l",lty=1,ylab="",xlab="")
title(main="J&J Earnings",cex=0.5)
ts.plot(y_log,type="l",lty=1,ylab="",xlab="")
title(main="transformed log(earnings)",cex=0.5)
dev.off()
# Model I: y_t = beta_1 + beta_2*t + e_t
z1=1:n
fit1=lm(y_log~z1)   # fit log earnings on a linear time trend
e1=fit1$resid
# Now, re-fit the model using the transformed (quasi-differenced) data
x1=5:n
y_1=y_log[5:n]
y_2=y_log[1:(n-4)]
y_fit=y_1-0.7614*y_2
x2=x1-0.7614*(x1-4)
x1=(1-0.7614)*rep(1,n-4)
fit2=lm(y_fit~-1+x1+x2)
e2=fit2$resid
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.2.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light pink")
acf(e1,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="ACF")
text(10,0.8,"detrended")
pacf(e1,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="PACF")
acf(e2,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")
text(15,0.8,"ARIMA(1,0,0)_4")
pacf(e2,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")
dev.off()

#####################################################
# This is Example 3.2 for weekly interest rate series
#####################################################
z<-read.table("c:/res-teach/xiamen12-06/data/ex3-2.txt",header=F)
# first column = one-year Treasury constant maturity rate;
# second column = three-year Treasury constant maturity rate;
# third column = date
x=z[,1]
y=z[,2]
n=length(x)
u=seq(1962+1/52,by=1/52,length=n)
x_diff=diff(x)
y_diff=diff(y)
# Fit a simple regression model and examine the residuals
fit1=lm(y~x)   # Model I
e1=fit1$resid
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.3.eps",
  horizontal=F,width=6,height=6)
matplot(u,cbind(x,y),type="l",lty=c(1,2),col=c(1,2),ylab="",xlab="")
dev.off()
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.4.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4,bg="light grey")
plot(x,y,type="p",pch="o",ylab="",xlab="",cex=0.5)
plot(x_diff,y_diff,type="p",pch="o",ylab="",xlab="",cex=0.5)
dev.off()
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.5.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(1,2),mex=0.4,bg="light green")
plot(u,e1,type="l",lty=1,ylab="",xlab="")
abline(0,0)
acf(e1,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")
dev.off()
# Take differences and fit a simple regression again
fit2=lm(y_diff~x_diff)   # Model II
e2=fit2$resid
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.6.eps",
  horizontal=F,width=6,height=6)
matplot(u[-1],cbind(x_diff,y_diff),type="l",lty=c(1,2),col=c(1,2),
  ylab="",xlab="")
abline(0,0)
dev.off()
postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-3.7.eps",
  horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light pink")
ts.plot(e2,type="l",lty=1,ylab="",xlab="")
abline(0,0)
acf(e2,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")
# Fit a model to the differenced data with an MA(1) error
fit3=arima(y_diff,xreg=x_diff,order=c(0,0,1))   # Model III
e3=fit3$resid
ts.plot(e3,type="l",lty=1,ylab="",xlab="")
abline(0,0)
acf(e3,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")
dev.off()

3.5 References

Amihud, Y. and C. Hurvich (2004). Predictive regressions: A reduced-bias estimation method. Journal of Financial and Quantitative Analysis, 39, 813-841.

Box, G. and D. Pierce (1970). Distribution of residual autocorrelations in autoregressive integrated moving average time series models. Journal of the American Statistical Association, 65, 1509-1526.

Campbell, J. and M. Yogo (2006). Efficient tests of stock return predictability. Journal of Financial Economics, 81, 27-60.

Cochrane, D. and G.H. Orcutt (1949). Applications of least squares regression to relationships containing autocorrelated errors. Journal of the American Statistical Association, 44, 32-61.

Ljung, G. and G. Box (1978). On a measure of lack of fit in time series models. Biometrika, 65, 297-303.

Stambaugh, R. (1999). Predictive regressions. Journal of Financial Economics, 54, 375-421.

Torous, W., R. Valkanov and S. Yan (2004). On predicting stock returns with nearly integrated explanatory variables. Journal of Business, 77, 937-966.

Xiao, Z., O.B. Linton, R.J. Carroll and E. Mammen (2003). More efficient local polynomial estimation in nonparametric regression with autocorrelated errors. Journal of the American Statistical Association, 98, 980-992.


Chapter 4

Estimation of Covariance Matrix

4.1 Methodology

Consider again the regression model in (1.2). There may exist situations in which the error e_t has serial correlations and/or conditional heteroscedasticity, but the main objective of the analysis is to make inference concerning the regression coefficients β. When e_t has serial correlations, we discussed methods in Example 3.1 and Example 3.2 above to overcome this difficulty. However, there we assumed that e_t follows an ARIMA-type model, and this assumption might not always be satisfied in some applications. Here, we consider a general situation without making this assumption. In situations under which the ordinary least squares estimates of the coefficients remain consistent, methods are available to provide consistent estimates of the covariance matrix of the coefficients. Two such methods are widely used in economics and finance. The first method is called the heteroscedasticity consistent (HC) estimator; see Eicker (1967) and White (1980). The second method is called the heteroscedasticity and autocorrelation consistent (HAC) estimator; see Newey and West (1987).

To ease the discussion, we shall re-write the regression model as

y_t = β^T x_t + e_t,

where y_t is the dependent variable, x_t = (x_{1t}, · · · , x_{pt})^T is a p-dimensional vector of explanatory variables including the constant and lagged variables, and β = (β_1, · · · , β_p)^T is the parameter vector. The LS estimate of β is given by

β̂ = [∑_{t=1}^n x_t x_t^T]^{-1} ∑_{t=1}^n x_t y_t,

and the associated covariance matrix has the so-called "sandwich" form

Σ_β̂ = Cov(β̂) = [∑_{t=1}^n x_t x_t^T]^{-1} C [∑_{t=1}^n x_t x_t^T]^{-1},

which reduces to σ²_e [∑_{t=1}^n x_t x_t^T]^{-1} if e_t is iid, where C is the "meat", given by

C = Var(∑_{t=1}^n e_t x_t),

and σ²_e is the variance of e_t, estimated by the variance of the residuals of the regression. In the presence of serial correlations or conditional heteroscedasticity, the prior covariance matrix estimator is inconsistent, often resulting in inflated t-ratios of β̂.

The estimator of White (1980) is based on the following:

Σ̂_{β̂,hc} = [∑_{t=1}^n x_t x_t^T]^{-1} Ĉ_hc [∑_{t=1}^n x_t x_t^T]^{-1},

where, with ê_t = y_t − β̂^T x_t being the residual at time t,

Ĉ_hc = [n/(n − p)] ∑_{t=1}^n ê_t² x_t x_t^T.
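To make the sandwich form concrete, here is a minimal sketch computing the White estimator directly from its definition (simulated data; compare with vcovHC() used in Section 4.3):

# White's HC covariance estimator computed from the sandwich formula (sketch)
set.seed(1)
n=200; x=cbind(1,rnorm(n))              # design matrix with intercept
y=x%*%c(1,2)+rnorm(n)*(1+abs(x[,2]))    # heteroscedastic errors
bread=solve(t(x)%*%x)
b=bread%*%t(x)%*%y                      # OLS estimate
e=drop(y-x%*%b)
meat=(n/(n-ncol(x)))*t(x*e^2)%*%x       # C_hc = n/(n-p) * sum e_t^2 x_t x_t'
Sigma=bread%*%meat%*%bread              # the sandwich
sqrt(diag(Sigma))                       # HC standard errors
# note: the factor n/(n-p) matches the text; vcovHC type "HC1" uses this factor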

The estimator of Newey and West (1987) is

Σ̂_{β̂,hac} = [∑_{t=1}^n x_t x_t^T]^{-1} Ĉ_hac [∑_{t=1}^n x_t x_t^T]^{-1},

where Ĉ_hac is given by

Ĉ_hac = ∑_{t=1}^n ê_t² x_t x_t^T + ∑_{j=1}^l w_j ∑_{t=j+1}^n (x_t ê_t ê_{t−j} x_{t−j}^T + x_{t−j} ê_{t−j} ê_t x_t^T),

with l a truncation parameter and w_j a weight function, such as the Bartlett weight function defined by w_j = 1 − j/(l + 1). Other weight functions can also be used. Newey and West (1987) showed that if l → ∞ and l⁴/T → 0, then Ĉ_hac is a consistent estimator of C. Newey and West (1987) suggested choosing l to be the integer part of 4(n/100)^{1/4}, and Newey and West (1994) suggested using some adaptive (data-driven) methods to choose l; see Newey and West (1994) for details. In general, this estimator essentially uses a nonparametric method to estimate the covariance matrix of ∑_{t=1}^n e_t x_t, and a class of kernel-based heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators was introduced by Andrews (1991). For example, the Bartlett weight w_j above can be replaced by w_j = K(j/(l + 1)), where K(·) is a kernel function, such as the truncated kernel K(x) = I(|x| ≤ 1), the Tukey-Hanning kernel K(x) = (1 + cos(π x))/2 for |x| ≤ 1, the Parzen kernel

K(x) = 1 − 6x² + 6|x|³ for 0 ≤ |x| ≤ 1/2;  2(1 − |x|)³ for 1/2 ≤ |x| ≤ 1;  0 otherwise,

and the quadratic spectral kernel

K(x) = [25/(12 π² x²)] [sin(6πx/5)/(6πx/5) − cos(6πx/5)].

Andrews (1991) suggested using a data-driven method to select the bandwidth l: l = 2.66 (α̂ T)^{1/5} for the Parzen kernel, l = 1.7462 (α̂ T)^{1/5} for the Tukey-Hanning kernel, and l = 1.3221 (α̂ T)^{1/5} for the quadratic spectral kernel, where

α̂ = [∑_{i=1}^k 4 ρ̂_i² σ̂_i⁴/(1 − ρ̂_i)^8] / [∑_{i=1}^k σ̂_i⁴/(1 − ρ̂_i)^4],

with ρ̂_i and σ̂_i being parameters estimated from an AR(1) model for (the elements of) u_t = ê_t x_t.

Example 4.1 (Continuation of Example 3.2): For illustration, we consider the first differenced interest rate series in Model II of Example 3.2. The t-ratio of the coefficient of ∆x_t is 104.63 if both serial correlation and conditional heteroscedasticity in the residuals are ignored; it becomes 46.73 when the HC estimator is used, and it reduces to 40.08 when the HAC estimator is employed.

To use the HC or HAC estimator, we can use the package sandwich in R; the commands are vcovHC(), vcovHAC(), and meatHAC(). There is a set of functions implementing a class of kernel-based heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators as introduced by Andrews (1991). In vcovHC(), these estimators differ in their choice of the ω_i in Ω = Var(e) = diag(ω_1, · · · , ω_n); an overview of the most important cases is given in the following:

const : ω_i = σ²
HC0 : ω_i = ê_i²
HC1 : ω_i = [n/(n − k)] ê_i²
HC2 : ω_i = ê_i²/(1 − h_i)
HC3 : ω_i = ê_i²/(1 − h_i)²
HC4 : ω_i = ê_i²/(1 − h_i)^{δ_i}

where h_i = H_{ii} are the diagonal elements of the hat matrix and δ_i = min{4, h_i/h̄}, with h̄ the mean of the h_i.

vcovHC(x, type = c("HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4"),
  omega = NULL, sandwich = TRUE, ...)

meatHC(x, type = c("HC3", "const", "HC", "HC0", "HC1", "HC2", "HC4"),
  omega = NULL)

vcovHAC(x, order.by = NULL, prewhite = FALSE, weights = weightsAndrews,
  adjust = TRUE, diagnostics = FALSE, sandwich = TRUE, ar.method = "ols",
  data = list(), ...)

meatHAC(x, order.by = NULL, prewhite = FALSE, weights = weightsAndrews,
  adjust = TRUE, diagnostics = FALSE, ar.method = "ols", data = list())

kernHAC(x, order.by = NULL, prewhite = 1, bw = bwAndrews,
  kernel = c("Quadratic Spectral", "Truncated", "Bartlett", "Parzen",
  "Tukey-Hanning"), approx = c("AR(1)", "ARMA(1,1)"), adjust = TRUE,
  diagnostics = FALSE, sandwich = TRUE, ar.method = "ols", tol = 1e-7,
  data = list(), verbose = FALSE, ...)

weightsAndrews(x, order.by = NULL, bw = bwAndrews,
  kernel = c("Quadratic Spectral", "Truncated", "Bartlett", "Parzen",
  "Tukey-Hanning"), prewhite = 1, ar.method = "ols", tol = 1e-7,
  data = list(), verbose = FALSE, ...)

bwAndrews(x, order.by = NULL, kernel = c("Quadratic Spectral", "Truncated",
  "Bartlett", "Parzen", "Tukey-Hanning"), approx = c("AR(1)", "ARMA(1,1)"),
  weights = NULL, prewhite = 1, ar.method = "ols", data = list(), ...)

Also, there is a set of functions implementing the Newey and West (1987, 1994) heteroskedasticity and autocorrelation consistent (HAC) covariance matrix estimators:

NeweyWest(x, lag = NULL, order.by = NULL, prewhite = TRUE, adjust = FALSE,
  diagnostics = FALSE, sandwich = TRUE, ar.method = "ols", data = list(),
  verbose = FALSE)

bwNeweyWest(x, order.by = NULL, kernel = c("Bartlett", "Parzen",
  "Quadratic Spectral", "Truncated", "Tukey-Hanning"), weights = NULL,
  prewhite = 1, ar.method = "ols", data = list(), ...)
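As a brief illustration (assuming fit1 is the lm fit from the code in Section 4.3), robust t-ratios can be obtained by combining these estimators with coeftest() from the package lmtest:

# Robust t-ratios via HC and HAC covariance estimates (illustrative sketch)
library(sandwich)
library(lmtest)
coeftest(fit1,vcov=vcovHC(fit1,type="HC0"))   # White (HC) standard errors
coeftest(fit1,vcov=NeweyWest(fit1))           # Newey-West (HAC) standard errors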

4.2 Details (see the paper by Zeileis)

4.3 Computer Codes

############################################
# This is Example 4.1 of using HC and HAC
############################################
library(sandwich)   # HC and HAC are in the package "sandwich"
library(zoo)
z<-read.table("c:/res-teach/xiamen12-06/data/ex4-1.txt",header=F)
x=z[,1]
y=z[,2]
x_diff=diff(x)
y_diff=diff(y)
# Fit a simple regression model and examine the residuals
fit1=lm(y_diff~x_diff)
print(summary(fit1))
e1=fit1$resid
# Heteroskedasticity-consistent covariance matrix estimation
#hc0=vcovHC(fit1,type="const")
#print(sqrt(diag(hc0)))
# type=c("const","HC","HC0","HC1","HC2","HC3","HC4")
# HC0 is the White estimator
hc1=vcovHC(fit1,type="HC0")
print(sqrt(diag(hc1)))
# Heteroskedasticity and autocorrelation consistent (HAC) estimation
# of the covariance matrix of the coefficient estimates in a
# (generalized) linear regression model
hac1=vcovHAC(fit1,sandwich=T)
print(sqrt(diag(hac1)))

4.4 References

Andrews, D.W.K. (1991). Heteroskedasticity and autocorrelation consistent covariance matrix estimation. Econometrica, 59, 817-858.

Eicker, F. (1967). Limit theorems for regression with unequal and dependent errors. In Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability (L. LeCam and J. Neyman, eds.), University of California Press, Berkeley.

Newey, W.K. and K.D. West (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55, 703-708.

Newey, W.K. and K.D. West (1994). Automatic lag selection in covariance matrix estimation. Review of Economic Studies, 61, 631-653.

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48, 817-838.

Zeileis, A. (2004). Econometric computing with HC and HAC covariance matrix estimators. Journal of Statistical Software, Volume 11, Issue 10.

Zeileis, A. (2006). Object-oriented computation of sandwich estimators. Journal of Statistical Software, 16, 1-16.


Chapter 5

Seasonal Time Series Models

5.1 Characteristics of Seasonality

When time series (particularly economic and financial time series) are observed each day or month or quarter, it is often the case that such a series displays a seasonal pattern (deterministic cyclical behavior). Similar to the feature of trend, there is no precise definition of seasonality. Usually we refer to seasonality when observations in certain seasons display strikingly different features from other seasons, for example, when retail sales are always large in the fourth quarter (because of the Christmas spending) and small in the first quarter, as can be observed from Figure 5.1. It may also be possible that seasonality is reflected in the variance of a time series. For example, for daily observed stock market returns the volatility often seems highest on Mondays, basically because investors have to digest three days of news instead of only one day. For more details, see the book by Taylor (2005, §4.5).

Example 5.1: For Example 3.1, the data shown in Figure 3.1 represent quarterly earnings per share for the American company Johnson & Johnson from the fourth quarter of 1970 to the first quarter of 1980. It is easy to note some very nonstationary behavior in this series that cannot be eliminated completely by differencing or detrending because of the larger fluctuations that occur near the end of the record, when the earnings are higher. The right panel of Figure 3.1 shows the log-transformed series, and we note that the later peaks have been attenuated, so that the variance of the transformed series seems more stable. One would still have to eliminate the trend remaining in the above series to obtain stationarity. For more details on the current analyses of this series, see the later analyses and the papers by Burman and Shumway (1998) and Cai and Chen (2006).

Example 5.2: In this example we consider the monthly US retail sales series (not seasonally adjusted) from January of 1967 to December of 2000 (in billions of US dollars). The data can be downloaded from the web site at http://marketvector.com. The U.S. retail sales index is one of the most important indicators of the US economy.

Figure 5.1: US retail sales data from 1967-2000.

There are vast studies of seasonal series (like this one) in the literature; see, e.g., Franses (1996, 1998), Ghysels and Osborn (2001), and Cai and Chen (2006). From Figure 5.1, we can observe that the peaks occur in December, so we can say that retail sales display seasonality. Also, it can be observed that the trend is basically increasing, but nonlinearly. The same phenomenon can be observed from Figure 3.1 for the quarterly earnings for Johnson & Johnson.

If simple graphs are not informative enough to highlight possible seasonal variation, a formal regression model can be used. For example, one might consider the following regression model with seasonal dummy variables:

∆y_t = y_t − y_{t−1} = ∑_{j=1}^s β_j D_{j,t} + ε_t,

where D_{j,t} is a seasonal dummy variable and s is the number of seasons. Of course, one can instead use a seasonal ARIMA model, denoted by ARIMA(p, d, q) × (P, D, Q)_s, which will be discussed later.
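As a minimal sketch of the dummy regression (simulated monthly data; all names are hypothetical):

# Seasonal dummy regression for a differenced monthly series (illustrative)
set.seed(1)
n=120; s=12
y=cumsum(rnorm(n))+rep(sin(2*pi*(1:s)/s),length.out=n)  # trend plus seasonal pattern
dy=diff(y)
t=2:n
season=factor(((t-1)%%s)+1)   # season index for each differenced observation
fit=lm(dy~season-1)           # one dummy per season, no intercept
print(summary(fit))           # beta_j = mean change in season j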

Example 5.3: In this example, we consider two time series with pronounced seasonality, displayed in Figure 5.2: logs of four-weekly advertising expenditures on radio and television in The Netherlands for 1978.01−1994.13.

Figure 5.2: Four-weekly advertising expenditures on radio and television in The Netherlands, 1978.01−1994.13.

For these two marketing time series one can observe clearly that the television advertising displays quite some seasonal fluctuation throughout the entire sample, while the radio advertising has seasonality only for the last five years. Also, there seems to be a structural break in the radio series around observation 53. This break is related to an increase in radio broadcasting minutes in January 1982. Furthermore, there is visual evidence that the trend changes over time.

Generally, many seasonally observed time series from business and economics, as well as from other applied fields, display seasonality in the sense that observations in certain seasons have properties that differ from those in other seasons. A second feature of many seasonal time series is that the seasonality changes over time, as studied by Cai and Chen (2006). Sometimes, these changes appear abrupt, as is the case for

advertising on the radio in Figure 5.2, and sometimes such changes occur only slowly. To

capture these phenomena, Cai and Chen (2006) proposed a more general flexible seasonal

effect model having the following form:

yij = α(ti) + βj(ti) + eij, i = 1, . . . , n, j = 1, . . . , s,

where yij = y(i−1)s+j, ti = i/n, α(·) is a (smooth) common trend function in [0, 1], βj(·) are (smooth) seasonal effect functions in [0, 1], either fixed or random, subject to a set of

constraints, and the error term eij is assumed to be stationary. For more details, see Cai

and Chen (2006).


5.2 Modeling

Some economic, financial, and environmental time series, such as the quarterly earnings per share of a company, exhibit certain cyclical or periodic behavior; see the later chapters for more discussion of cycles and periodicity. Such a time series is called a seasonal

(deterministic cycle) time series. Figure 3.1 shows the time plot of quarterly earning per share

of Johnson and Johnson from the first quarter of 1960 to the last quarter of 1980. The data

possess some special characteristics. In particular, the earnings grew exponentially during the sample period and had a strong seasonality. Furthermore, the variability of the earnings increased over time. The cyclical pattern repeats itself every year so that the periodicity of

the series is 4. If monthly data are considered (e.g., monthly sales of Wal-Mart Stores), then

the periodicity is 12. Seasonal time series models are also useful in pricing weather-related

derivatives and energy futures.

Analysis of seasonal time series has a long history. In some applications, seasonality is of

secondary importance and is removed from the data, resulting in a seasonally adjusted time

series that is then used to make inference. The procedure to remove seasonality from a time

series is referred to as seasonal adjustment. Most economic data published by the U.S.

government are seasonally adjusted (e.g., the growth rate of gross domestic product and the

unemployment rate). In other applications such as forecasting, seasonality is as important

as other characteristics of the data and must be handled accordingly. Because forecasting

is a major objective of economic and financial time series analysis, we focus on the latter

approach and discuss some econometric models that are useful in modeling seasonal time

series.

When the autoregressive, differencing, or seasonal moving average behavior seems to

occur at multiples of some underlying period s, a seasonal ARIMA series may result. The

seasonal nonstationarity is characterized by slow decay at multiples of s and can often be

eliminated by a seasonal differencing operator of the form ∆_s^D xt = (1 − L^s)^D xt. For

example, when we have monthly data, it is reasonable that a yearly phenomenon will induce

s = 12 and the ACF will be characterized by slowly decaying spikes at 12, 24, 36, 48, · · ·, and

we can obtain a stationary series by transforming with the operator (1 − L^12) xt = xt − xt−12,

which is the difference between the current month and the value one year or 12 months ago.
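In R, seasonal differencing is carried out with diff(); a short sketch, assuming a monthly series stored in a vector x:

x_12=diff(x,lag=12)         # (1-L^12) x_t = x_t - x_{t-12}
dx_12=diff(diff(x),lag=12)  # (1-L)(1-L^12) x_t
acf(x_12)                   # check whether the seasonal spikes have disappeared
acf(dx_12)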

If the autoregressive or moving average behavior is seasonal at period s, we define formally


the operators

Φ(L^s) = 1 − Φ1 L^s − Φ2 L^{2s} − · · · − ΦP L^{Ps},     (5.1)

and

Θ(L^s) = 1 − Θ1 L^s − Θ2 L^{2s} − · · · − ΘQ L^{Qs}.     (5.2)

The final form of the seasonal ARIMA(p, d, q) × (P,D,Q)s model is

Φ(L^s) φ(L) ∆_s^D ∆^d xt = Θ(L^s) θ(L) wt.     (5.3)

Note that one special model of (5.3) is ARIMA(0, 1, 1) × (0, 1, 1)s, that is

(1 − L^s)(1 − L) xt = (1 − θ1 L)(1 − Θ1 L^s) wt.

This model is referred to as the airline model or multiplicative seasonal model in the

literature; see Box and Jenkins (1970), Box, Jenkins, and Reinsel (1994, Chapter 9), and

Brockwell and Davis (1991). It has been found to be widely applicable in modeling seasonal

time series. The AR part of the model simply consists of the regular and seasonal differences,

whereas the MA part involves two parameters.

We may also note the following properties, which parallel those of nonseasonal ARMA models.

Property 5.1: The ACF of a seasonally non-stationary time series decays very slowly at

lag multiples s, 2s, 3s, · · ·, with zeros in between, where s denotes a seasonal period, usually

4 for quarterly data or 12 for monthly data. The PACF of a non-stationary time series tends

to have a peak very near unity at lag s.

Property 5.2: For a seasonal autoregressive series of order P , the partial autocorrelation

function Φhh as a function of lag h has nonzero values at s, 2s, 3s, · · ·, Ps, with zeros in

between, and is zero for h > Ps, the order of the seasonal autoregressive process. There

should be some exponential decay.

Property 5.3: For a seasonal moving average series of order Q, note that the autocorrelation

function (ACF) has nonzero values at s, 2s, 3s, · · ·, Qs and is zero for h > Qs.

Remark: Note that there is a built-in command in R called arima() which is a powerful

tool for estimating and making inference for an ARIMA model. The command is


arima(x,order=c(0,0,0),seasonal=list(order=c(0,0,0),period=NA),

xreg=NULL,include.mean=TRUE, transform.pars=TRUE,fixed=NULL,init=NULL,

method=c("CSS-ML","ML","CSS"),n.cond,optim.control=list(),kappa=1e6)

See the R manuals for details about this command.
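As a hedged usage example (the monthly series x is an assumption, not tied to a particular data set in this chapter), the airline model ARIMA(0, 1, 1) × (0, 1, 1)12 can be fitted as follows, with method="ML" requesting the exact likelihood:

fit=arima(x,order=c(0,1,1),seasonal=list(order=c(0,1,1),period=12),
    include.mean=F,method="ML")
print(fit)     # estimates of theta_1, Theta_1 and sigma_w^2
tsdiag(fit)    # residual diagnostics: ACF and Ljung-Box p-values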

Example 5.4: We illustrate by fitting the monthly birth series from 1948-1979 shown in

Figure 5.3. The period encompasses the boom that followed the Second World War and there


Figure 5.3: Number of live births 1948(1)−1979(1) and residuals from models with a first difference, a first difference and a seasonal difference of order 12, and a fitted ARIMA(0, 1, 1) × (0, 1, 1)12 model.

is the expected rise which persists for about 13 years followed by a decline to around 1974.

The series appears to have long-term swings, with seasonal effects superimposed. The long-

term swings indicate possible non-stationarity and we verify that this is the case by checking

the ACF and PACF shown in the top panel of Figure 5.4. Note that by Property 5.1,

slow decay of the ACF indicates non-stationarity and we respond by taking a first difference.

The results shown in the second panel of Figure 5.3 indicate that the first difference has eliminated the strong low-frequency swing. The ACF, shown in the second panel from the top in Figure 5.4, shows peaks at 12, 24, 36, 48, · · ·, with no decay. This behavior implies



Figure 5.4: Autocorrelation functions and partial autocorrelation functions for the birth series (top two panels), the first difference (second two panels), an ARIMA(0, 1, 0) × (0, 1, 1)12 model (third two panels), and an ARIMA(0, 1, 1) × (0, 1, 1)12 model (last two panels).


seasonal non-stationarity, by Property 5.1 above, with s = 12. A seasonal difference of

the first difference generates an ACF and PACF in Figure 5.4 that we expect for stationary

series.

Taking the seasonal difference of the first difference gives a series that looks stationary

and has an ACF with peaks at 1 and 12 and a PACF with a substantial peak at 12 and

lesser peaks at 24, 36, · · ·. This suggests trying either a first-order moving average term,

or a first order seasonal moving average term with s = 12, by Property 5.3 above. We

choose to eliminate the largest peak first by applying a first-order seasonal moving average

model with s = 12. The ACF and PACF of the residual series from this model, i.e. from

ARIMA(0, 1, 0) × (0, 1, 1)12, written as (1 − L)(1 − L^12) xt = (1 − Θ1 L^12) wt, are shown in the

fourth panel from the top in Figure 5.4. We note that the peak at lag one is still there, with

attending exponential decay in the PACF. This can be eliminated by fitting a first-order

moving average term and we consider the model ARIMA(0, 1, 1) × (0, 1, 1)12, written as

(1 − L)(1 − L^12) xt = (1 − θ1 L)(1 − Θ1 L^12) wt.

The ACF of the residuals from this model is relatively well behaved, with a number of peaks either near or exceeding the 95% limits under the hypothesis of no correlation. Fitting this final ARIMA(0, 1, 1) × (0, 1, 1)12 model leads to

(1 − L)(1 − L^12) xt = (1 − 0.4896 L)(1 − 0.6844 L^12) wt

with AICC = 4.95, R² = 0.9804² = 0.961, and p-values of (0.000, 0.000). The ARIMA

search leads to the model

(1 − L)(1 − L^12) xt = (1 − 0.4088 L − 0.1645 L²)(1 − 0.6990 L^12) wt,

yielding AICC = 4.92 and R² = 0.981² = 0.962, slightly better than the ARIMA(0, 1, 1) × (0, 1, 1)12 model. Evaluating these latter models leads to the conclusion that the extra

parameters do not add a practically substantial amount to the predictability. The model is

expanded as

xt = xt−1 + xt−12 − xt−13 + wt − θ1 wt−1 − Θ1 wt−12 + θ1 Θ1 wt−13.

The forecasts are computed recursively:

x^t_{t+1} = xt + xt−11 − xt−12 − θ1 wt − Θ1 wt−11 + θ1 Θ1 wt−12,

x^t_{t+2} = x^t_{t+1} + xt−10 − xt−11 − Θ1 wt−10 + θ1 Θ1 wt−11.

Continuing in the same manner, we obtain

x^t_{t+12} = x^t_{t+11} + xt − xt−1 − Θ1 wt + θ1 Θ1 wt−1

for the 12-month forecast.
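In R, this recursion is carried out automatically by predict(); a sketch using the object fit5, the ARIMA(0, 1, 1) × (0, 1, 1)12 fit for the birth series in Section 5.4:

fc=predict(fit5,n.ahead=12)   # 1- to 12-step ahead forecasts
fc$pred                       # point forecasts
fc$pred-1.96*fc$se            # approximate 95% lower forecast limits
fc$pred+1.96*fc$se            # approximate 95% upper forecast limits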

Example 5.5: Figure 5.5 shows the autocorrelation function of the log-transformed J&J

earnings series that is plotted in Figure 3.1 and we note the slow decay indicating the

nonstationarity which has already been obvious in the Chapter 3 discussion. We may also

compare the ACF with that of a random walk, and note the close similarity. The partial

autocorrelation function is very high at lag one which, under ordinary circumstances, would

indicate a first order autoregressive AR(1) model, except that, in this case, the value is

close to unity, indicating a root close to 1 on the unit circle. The only question would be

whether differencing or detrending is the better transformation to stationarity. Following

in the Box-Jenkins tradition, differencing leads to the ACF and PACF shown in the second

panel and no simple structure is apparent. To force a next step, we interpret the peaks at 4,

8, 12, 16, · · ·, as contributing to a possible seasonal autoregressive term, leading to a possible

ARIMA(0, 1, 0) × (1, 0, 0)4 and we simply fit this model and look at the ACF and PACF of

the residuals, shown in the third two panels. The fit improves somewhat, with significant

peaks still remaining at lag 1 in both the ACF and PACF. The peak in the ACF seems more

isolated and there remains some exponentially decaying behavior in the PACF, so we try a

model with a first-order moving average. The bottom two panels show the ACF and PACF

of the resulting ARIMA(0, 1, 1)×(1, 0, 0)4 and we note only relatively minor excursions above

and below the 95% intervals under the assumption that the theoretical ACF is white noise.

The final model suggested is (yt = log xt)

(1 − Φ1 L^4)(1 − L) yt = (1 − θ1 L) wt,     (5.4)

where Φ1 = 0.820 (0.058), θ1 = 0.508 (0.098), and σ²_w = 0.0086. The model can be written in

forecast form as

yt = yt−1 + Φ1(yt−4 − yt−5) + wt − θ1wt−1.

The residual plot of this model is shown in the bottom left panel of Figure 5.6. To forecast the original series for, say, 4 quarters, we compute the forecast limits for yt = log xt and then exponentiate, i.e., x^t_{t+h} = exp(y^t_{t+h}).
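A sketch of this back-transformation in R, using the object fit4, the ARIMA(0, 1, 1) × (1, 0, 0)4 fit for the log J&J series in Section 5.4:

fc=predict(fit4,n.ahead=4)    # forecasts of y_t = log(x_t)
exp(fc$pred)                  # point forecasts of x_t
exp(fc$pred-1.96*fc$se)       # lower 95% forecast limits for x_t
exp(fc$pred+1.96*fc$se)       # upper 95% forecast limits for x_t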



Figure 5.5: Autocorrelation functions (ACF) and partial autocorrelation functions (PACF) for the log J&J earnings series (top two panels), the first difference (second two panels), an ARIMA(0, 1, 0) × (1, 0, 0)4 model (third two panels), and an ARIMA(0, 1, 1) × (1, 0, 0)4 model (last two panels).

Based on the exact likelihood method, Tsay (2005) considered the following seasonal ARIMA(0, 1, 1) × (0, 1, 1)4 model:

(1 − L)(1 − L^4) yt = (1 − 0.678 L)(1 − 0.314 L^4) wt,     (5.5)

with σ²_w = 0.089, where the standard errors of the two MA parameters are 0.080 and 0.101,

respectively. The Ljung-Box statistics of the residuals show Q(12) = 10.0 with p-value 0.44.

The model appears to be adequate. The ACF and PACF of the ARIMA(0, 1, 1) × (0, 1, 1)4



Figure 5.6: ACF and PACF for the ARIMA(0, 1, 1) × (0, 1, 1)4 model (top two panels) and the residual plots of the ARIMA(0, 1, 1) × (1, 0, 0)4 model (left bottom panel) and the ARIMA(0, 1, 1) × (0, 1, 1)4 model (right bottom panel).

model are given in the top two panels of Figure 5.6 and the residual plot is displayed in

the right bottom panel of Figure 5.6. Based on a comparison of the ACF and PACF of the two models (5.4) and (5.5) [the last two panels of Figure 5.5 and the top two panels of Figure 5.6], it seems that the ARIMA(0, 1, 1) × (0, 1, 1)4 model in (5.5) might perform better than the ARIMA(0, 1, 1) × (1, 0, 0)4 model in (5.4).

To illustrate the forecasting performance of the seasonal model in (5.5), we re-estimate

the model using the first 76 observations and reserve the last eight data points for fore-

casting evaluation. We compute 1-step to 8-step ahead forecasts and their standard errors

of the fitted model at the forecast origin t = 76. An anti-log transformation is taken to

obtain forecasts of earning per share using the relationship between normal and log-normal

distributions. Figure 2.15 in Tsay (2005, p.77) shows the forecast performance of the model,

where the observed data are in solid line, point forecasts are shown by dots, and the dashed

lines show 95% interval forecasts. The forecasts show a strong seasonal pattern and are close

to the observed data. For more comparisons for forecasts using different models including

semiparametric and nonparametric models, the reader is referred to the books by Shumway (1988) and Shumway and Stoffer (2000) and to the papers by Burman and Shumway (1998)


and Cai and Chen (2006).

When the seasonal pattern of a time series is stable over time (e.g., close to a deterministic

function), dummy variables may be used to handle the seasonality. This approach is taken

by some analysts. However, deterministic seasonality is a special case of the multiplicative

seasonal model discussed before. Specifically, if Θ1 = 1, then the model contains a deterministic

seasonal component. Consequently, the same forecasts are obtained by using either dummy

variables or a multiplicative seasonal model when the seasonal pattern is deterministic. Yet

use of dummy variables can lead to inferior forecasts if the seasonal pattern is not deter-

ministic. In practice, we recommend that the exact likelihood method should be used to

estimate a multiplicative seasonal model, especially when the sample size is small or when

there is the possibility of having a deterministic seasonal component.

Example 5.6: To determine deterministic behavior, consider the monthly simple return of

the CRSP Decile 1 index from January 1960 to December 2003 for 528 observations. The

series is shown in the left top panel of Figure 5.7 and the time series does not show any

clear pattern of seasonality. However, the sample ACF of the return series shown in the left


Figure 5.7: Monthly simple return of the CRSP Decile 1 index from January 1960 to December 2003: Time series plot of the simple return (left top panel), time series plot of the simple return after adjusting for the January effect (right top panel), the ACF of the simple return (left bottom panel), and the ACF of the adjusted simple return (right bottom panel).


bottom panel of Figure 5.7 contains significant lags at 12, 24, and 36 as well as at lag 1. If seasonal ARIMA models are entertained, a model of the form

(1 − φ1 L)(1 − Φ1 L^12) xt = α + (1 − Θ1 L^12) wt

is identified, where xt is the monthly simple return. Using the conditional likelihood, the

fitted model is

(1 − 0.25 L)(1 − 0.99 L^12) xt = 0.0004 + (1 − 0.92 L^12) wt

with σw = 0.071. The MA coefficient is close to unity, indicating that the fitted model is

close to being non-invertible. If the exact likelihood method is used, we have

(1 − 0.264 L)(1 − 0.996 L^12) xt = 0.0002 + (1 − 0.999 L^12) wt

with σw = 0.067. The cancellation between the seasonal AR and MA factors is clear. This highlights the usefulness of the exact likelihood method, and the estimation result suggests

that the seasonal behavior might be deterministic. To further confirm this assertion, we

define the dummy variable for January, that is

Jt = 1 if month t is January, and Jt = 0 otherwise,

and employ the simple linear regression

xt = β0 + β1 Jt + et.

The right panels of Figure 5.7 show the time series plot and the ACF of the residual series from this simple linear regression. From the ACF, there is no significant serial correlation at any multiple of 12, suggesting that the seasonal pattern has been successfully removed

by the January dummy variable. Consequently, the seasonal behavior in the monthly simple

return of Decile 1 is due to the January effect.

5.3 Nonlinear Seasonal Time Series Models

See the papers by Burman and Shumway (1998) and Cai and Chen (2006) and the books

by Franses (1998) and Ghysels and Osborn (2001). The reading materials are the papers by

Burman and Shumway (1998) and Cai and Chen (2006).


5.4 Computer Codes

##########################################

# This is Example 5.2 for retail sales data

#########################################

y=read.table("c:/res-teach/xiamen12-06/data/ex5-2.txt",header=F)

postscript(file="c:/res-teach/xiamen12-06/figs/fig-5.1.eps",

horizontal=F,width=6,height=6)

ts.plot(y,type="l",lty=1,ylab="",xlab="")

dev.off()

############################################

# This is Example 5.3 for the marketing data

############################################

text_tv=c("television")

text_radio=c("radio")

data<-read.table("c:/res-teach/xiamen12-06/data/ex5-3.txt",header=T)

TV=log(data[,1])

RADIO=log(data[,2])

postscript(file="c:/res-teach/xiamen12-06/figs/fig-5.2.eps",

horizontal=F,width=6,height=6)

ts.plot(cbind(TV,RADIO),type="l",lty=c(1,2),col=c(1,2),ylab="",xlab="")

text(20,10.5,text_tv)

text(165,8,text_radio)

dev.off()

####################################################################

# This is Example 5.4 for the monthly birth series

####################################

x<-matrix(scan("c:/res-teach/xiamen12-06/data/ex5-4.txt"),byrow=T,ncol=1)

n=length(x)

x_diff=diff(x)

x_diff_12=diff(x_diff,lag=12)


fit1=arima(x,order=c(0,0,0),seasonal=list(order=c(0,0,0)),include.mean=F)

resid_1=fit1$resid

fit2=arima(x,order=c(0,1,0),seasonal=list(order=c(0,0,0)),include.mean=F)

resid_2=fit2$resid

fit3=arima(x,order=c(0,1,0),seasonal=list(order=c(0,1,0),period=12),

include.mean=F)

resid_3=fit3$resid

postscript(file="c:/res-teach/xiamen12-06/figs/fig-5.4.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(5,2),mex=0.4,bg="light pink")

acf(resid_1, ylab="", xlab="",ylim=c(-0.5,1),lag=60,main="ACF",cex=0.7)

pacf(resid_1,ylab="",xlab="",ylim=c(-0.5,1),lag=60,main="PACF",cex=0.7)

text(20,0.7,"data",cex=1.2)

acf(resid_2, ylab="", xlab="",ylim=c(-0.5,1),lag=60,main="")

# differenced data

pacf(resid_2,ylab="",xlab="",ylim=c(-0.5,1),lag=60,main="")

text(30,0.7,"ARIMA(0,1,0)")

acf(resid_3, ylab="", xlab="",ylim=c(-0.5,1),lag=60,main="")

# seasonal difference of differenced data

pacf(resid_3,ylab="",xlab="",ylim=c(-0.5,1),lag=60,main="")

text(30,0.7,"ARIMA(0,1,0)X(0,1,0)_12",cex=0.8)

fit4=arima(x,order=c(0,1,0),seasonal=list(order=c(0,1,1),

period=12),include.mean=F)

resid_4=fit4$resid

fit5=arima(x,order=c(0,1,1),seasonal=list(order=c(0,1,1),

period=12),include.mean=F)

resid_5=fit5$resid

acf(resid_4, ylab="", xlab="",ylim=c(-0.5,1),lag=60,main="")


# ARIMA(0,1,0)*(0,1,1)_12

pacf(resid_4,ylab="",xlab="",ylim=c(-0.5,1),lag=60,main="")

text(30,0.7,"ARIMA(0,1,0)X(0,1,1)_12",cex=0.8)

acf(resid_5, ylab="", xlab="",ylim=c(-0.5,1),lag=60,main="")

# ARIMA(0,1,1)*(0,1,1)_12

pacf(resid_5,ylab="",xlab="",ylim=c(-0.5,1),lag=60,main="")

text(30,0.7,"ARIMA(0,1,1)X(0,1,1)_12",cex=0.8)

dev.off()

postscript(file="c:/res-teach/xiamen12-06/figs/fig-5.3.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(2,2),mex=0.4,bg="light blue")

ts.plot(x,type="l",lty=1,ylab="",xlab="")

text(250,375, "Births")

ts.plot(x_diff,type="l",lty=1,ylab="",xlab="",ylim=c(-50,50))

text(255,45, "First difference")

abline(0,0)

ts.plot(x_diff_12,type="l",lty=1,ylab="",xlab="",ylim=c(-50,50))

# time series plot of the seasonal difference (s=12) of differenced data

text(225,40,"ARIMA(0,1,0)X(0,1,0)_12")

abline(0,0)

ts.plot(resid_5,type="l",lty=1,ylab="",xlab="",ylim=c(-50,50))

text(225,40, "ARIMA(0,1,1)X(0,1,1)_12")

abline(0,0)

dev.off()

########################

# This is Example 5.5

########################

y=read.table(’c:/res-teach/xiamen12-06/data/ex3-1.txt’,header=F)

n=length(y[,1])


y_log=log(y[,1]) # log of data

y_diff=diff(y_log) # first-order difference

y_diff_4=diff(y_diff,lag=4) # first-order seasonal difference

fit1=ar(y_log,order.max=1,aic=FALSE) # fit AR(1) model

#print(fit1)

library(tseries) # call library(tseries)

library(zoo)

fit1_test=adf.test(y_log)

# do Augmented Dickey-Fuller test for testing a unit root

#print(fit1_test)

fit1=arima(y_log,order=c(0,0,0),seasonal=list(order=c(0,0,0)),

include.mean=F)

resid_21=fit1$resid

fit2=arima(y_log,order=c(0,1,0),seasonal=list(order=c(0,0,0)),

include.mean=F)

resid_22=fit2$resid # residual for ARIMA(0,1,0)*(0,0,0)

fit3=arima(y_log,order=c(0,1,0),seasonal=list(order=c(1,0,0),period=4),

include.mean=F,method=c("CSS"))

resid_23=fit3$resid # residual for ARIMA(0,1,0)*(1,0,0)_4

# note that this model is non-stationary so that "CSS" is used

postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-5.5.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(4,2),mex=0.4,bg="light green")

acf(resid_21, ylab="", xlab="",ylim=c(-0.5,1),lag=30,main="ACF",cex=0.7)

text(16,0.8,"log(J&J)")

pacf(resid_21,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="PACF",cex=0.7)

acf(resid_22, ylab="", xlab="",ylim=c(-0.5,1),lag=30,main="")

text(16,0.8,"First Difference")

pacf(resid_22,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")


acf(resid_23, ylab="", xlab="",ylim=c(-0.5,1),lag=30,main="")

text(16,0.8,"ARIMA(0,1,0)X(1,0,0,)_4",cex=0.8)

pacf(resid_23,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")

fit4=arima(y_log,order=c(0,1,1),seasonal=list(order=c(1,0,0),

period=4),include.mean=F,method=c("CSS"))

resid_24=fit4$resid # residual for ARIMA(0,1,1)*(1,0,0)_4

# note that this model is non-stationary

#print(fit4)

fit4_test=Box.test(resid_24,lag=12, type=c("Ljung-Box"))

#print(fit4_test)

acf(resid_24, ylab="", xlab="",ylim=c(-0.5,1),lag=30,main="")

text(16,0.8,"ARIMA(0,1,1)X(1,0,0,)_4",cex=0.8)

# ARIMA(0,1,1)*(1,0,0)_4

pacf(resid_24,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="")

dev.off()

fit5=arima(y_log,order=c(0,1,1),seasonal=list(order=c(0,1,1),period=4),

include.mean=F,method=c("ML"))

resid_25=fit5$resid # residual for ARIMA(0,1,1)*(0,1,1)_4

#print(fit5)

fit5_test=Box.test(resid_25,lag=12, type=c("Ljung-Box"))

#print(fit5_test)

postscript(file="c:\\res-teach\\xiamen12-06\\figs\\fig-5.6.eps",

horizontal=F,width=6,height=6,bg="light grey")

par(mfrow=c(2,2),mex=0.4)

acf(resid_25, ylab="", xlab="",ylim=c(-0.5,1),lag=30,main="ACF")

text(16,0.8,"ARIMA(0,1,1)X(0,1,1,)_4",cex=0.8)

# ARIMA(0,1,1)*(0,1,1)_4

pacf(resid_25,ylab="",xlab="",ylim=c(-0.5,1),lag=30,main="PACF")

ts.plot(resid_24,type="l",lty=1,ylab="",xlab="")

title(main="Residual Plot",cex=0.5)


text(40,0.2,"ARIMA(0,1,1)X(1,0,0,)_4",cex=0.8)

abline(0,0)

ts.plot(resid_25,type="l",lty=1,ylab="",xlab="")

title(main="Residual Plot",cex=0.5)

text(40,0.18,"ARIMA(0,1,1)X(0,1,1,)_4",cex=0.8)

abline(0,0)

dev.off()

#########################

# This is Example 5.6

########################

z<-matrix(scan("c:/res-teach/xiamen12-06/data/ex5-6.txt"),byrow=T,ncol=4)

decile1=z[,2]

# Model 1: an ARIMA(1,0,0)*(1,0,1)_12

fit1=arima(decile1,order=c(1,0,0),seasonal=list(order=c(1,0,1),

period=12),include.mean=T)

#print(fit1)

e1=fit1$resid

n=length(decile1)

m=n/12

jan=rep(c(1,0,0,0,0,0,0,0,0,0,0,0),m)

feb=rep(c(0,1,0,0,0,0,0,0,0,0,0,0),m)

mar=rep(c(0,0,1,0,0,0,0,0,0,0,0,0),m)

apr=rep(c(0,0,0,1,0,0,0,0,0,0,0,0),m)

may=rep(c(0,0,0,0,1,0,0,0,0,0,0,0),m)

jun=rep(c(0,0,0,0,0,1,0,0,0,0,0,0),m)

jul=rep(c(0,0,0,0,0,0,1,0,0,0,0,0),m)

aug=rep(c(0,0,0,0,0,0,0,1,0,0,0,0),m)

sep=rep(c(0,0,0,0,0,0,0,0,1,0,0,0),m)

oct=rep(c(0,0,0,0,0,0,0,0,0,1,0,0),m)


nov=rep(c(0,0,0,0,0,0,0,0,0,0,1,0),m)

dec=rep(c(0,0,0,0,0,0,0,0,0,0,0,1),m)

de=cbind(decile1[jan==1],decile1[feb==1],decile1[mar==1],decile1[apr==1],

decile1[may==1],decile1[jun==1],decile1[jul==1],decile1[aug==1],

decile1[sep==1],decile1[oct==1],decile1[nov==1],decile1[dec==1])

# Model 2: a simple regression model without correlated errors

# to see the effect from January

fit2=lm(decile1~jan)

e2=fit2$resid

#print(summary(fit2))

# Model 3: a regression model with correlated errors

fit3=arima(decile1,xreg=jan,order=c(0,0,1),include.mean=T)

e3=fit3$resid

#print(fit3)

postscript(file="c:/res-teach/xiamen12-06/figs/fig-5.7.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(2,2),mex=0.4,bg="light yellow")

ts.plot(decile1,type="l",lty=1,col=1,ylab="",xlab="")

title(main="Simple Returns",cex=0.5)

abline(0,0)

ts.plot(e3,type="l",lty=1,col=1,ylab="",xlab="")

title(main="January-adjusted returns",cex=0.5)

abline(0,0)

acf(decile1, ylab="", xlab="",ylim=c(-0.5,1),lag=40,main="ACF")

acf(e3,ylab="",xlab="",ylim=c(-0.5,1),lag=40,main="ACF")

dev.off()

5.5 References

Box, G.E.P. and G.M. Jenkins (1970). Time Series Analysis, Forecasting, and Control. San Francisco: Holden-Day.

Box, G.E.P., G.M. Jenkins and G.C. Reinsel (1994). Time Series Analysis, Forecasting and Control, 3rd ed. Englewood Cliffs, NJ: Prentice-Hall.

Brockwell, P.J. and R.A. Davis (1991). Time Series: Theory and Methods. New York: Springer.

Burman, P. and R.H. Shumway (1998). Semiparametric modeling of seasonal time series. Journal of Time Series Analysis, 19, 127-145.

Cai, Z. and R. Chen (2006). Flexible seasonal time series models. Advances in Econometrics, 20B, 63-87.

Franses, P.H. (1998). Time Series Models for Business and Economic Forecasting. New York: Cambridge University Press.

Franses, P.H. and D. van Dijk (2000). Nonlinear Time Series Models for Empirical Finance. New York: Cambridge University Press.

Ghysels, E. and D.R. Osborn (2001). The Econometric Analysis of Seasonal Time Series. New York: Cambridge University Press.

Shumway, R.H. (1988). Applied Statistical Time Series Analysis. Englewood Cliffs, NJ: Prentice-Hall.

Shumway, R.H., A.S. Azari and Y. Pawitan (1988). Modeling mortality fluctuations in Los Angeles as functions of pollution and weather effects. Environmental Research, 45, 224-241.

Shumway, R.H. and D.S. Stoffer (2000). Time Series Analysis & Its Applications. New York: Springer-Verlag.

Tiao, G.C. and R.S. Tsay (1983). Consistency properties of least squares estimates of autoregressive parameters in ARMA models. Annals of Statistics, 11, 856-871.

Tsay, R.S. (2005). Analysis of Financial Time Series, 2nd ed. New York: John Wiley & Sons.


Chapter 6

Robust and Quantile Regressions

There is a lot of material on both robust regression and linear quantile regression. Here, however, we give only some basic concepts and a brief overview of the methodology. For more detailed

discussions, see the books by Huber (1981) and Koenker (2005).

6.1 Robust Regression

Robustness means that good statistical procedures should work well even if the underlying

assumptions are somewhat in error. For a classical regression model

Yi = β^T Xi + εi,

the general form of a robust estimator is the minimizer of

∑_{i=1}^{n} ρ(Yi − β^T Xi),

where ρ(·) is a (“loss”) function to be specified. If ρ(z) = z² (quadratic), then it is

the usual least squares criterion. If ρ(z) = |z|, it gives the least absolute deviation (LAD)

estimator (median regression). Another important choice is Huber's (1981) function

ρc(z) = z² if |z| ≤ c, and ρc(z) = c|z| − c²/2 if |z| > c,

where c is a fixed constant, or

ρc,0(z) = z² if |z| ≤ c, and ρc,0(z) = c² if |z| > c,

and the corresponding estimator is called an M-estimator. There are many choices of ρ(·); see the book by Serfling (1980) for details. This choice of ρc(·) results in down-weighting


cases with large residuals. For more details on robust regression, see the book by Huber

(1981).

To implement robust regression in R, you can use the command rlm() (M-estimation) or lqs() (resistant regression) in the package MASS [library(MASS)].

rlm(formula, data, weights, ..., subset, na.action,

method = c("M", "MM", "model.frame"),

wt.method = c("inv.var", "case"),

model = TRUE, x.ret = TRUE, y.ret = FALSE, contrasts = NULL)

lqs(formula, data, ...,

method = c("lts", "lqs", "lms", "S", "model.frame"),

subset, na.action, model = TRUE,

x.ret = FALSE, y.ret = FALSE, contrasts = NULL)
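A brief usage sketch on the stackloss data built into R (the data set is chosen purely for illustration):

library(MASS)
fit_m=rlm(stack.loss~.,data=stackloss,psi=psi.huber)   # Huber M-estimation
summary(fit_m)                                         # large residuals are down-weighted
fit_lts=lqs(stack.loss~.,data=stackloss,method="lts")  # least trimmed squares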

6.2 Quantile Regression

The above LAD estimator is indeed a special case (τ = 1/2) of the following quantile regression criterion:

∑_{i=1}^{n} ρτ(Yi − β^T Xi),

where ρτ(z) = z(τ − I_{z<0}) is the so-called “check” (loss) function and I_A is the indicator function of a set A. All the aforementioned loss functions are plotted in Figure 6.1 using R (the R code for this figure can be found in Section 6.3). Indeed, the

above problem can be formulated as a general quantile regression: for any 0 < τ < 1,

F_y(qτ(Xi) | Xi) = τ,

where F_y(y | Xi) is the conditional distribution function of Yi given Xi and qτ(Xi) = βτ^T Xi, which allows the coefficients to depend on τ. This linear quantile regression model was proposed by Koenker and Bassett (1978, 1982), to which the reader is referred for details. There are several advantages of using a quantile regression:

• A quantile regression does not require knowing the distribution of the dependent variable.



Figure 6.1: Different loss functions: quadratic, Huber's ρc(·), ρc,0(·), ρ0.05(·), LAD, and ρ0.95(·), where c = 1.345.

• It does not require the symmetry of the measurement error.

• It can characterize the heterogeneity.

• It can estimate the mean and variance simultaneously.

• It is a robust procedure.

There are many more.

Clearly, a quantile regression model includes the LAD and the classical mean regression models as special cases. To see this, for a classical regression model

Yi = β^T Xi + εi,

its τ -th quantile is

qτ(Xi) = β^T Xi + qτ,i(ε),

where qτ,i(ε) is the τ-th quantile of εi. For example, if εi = σ(Xi) ui with iid ui, then qτ,i(ε) = σ(Xi) qτ,u, where qτ,u is the τ-th quantile of ui, and

qτ(Xi) = β^T Xi + σ(Xi) qτ,u.


This property can be used as an informative way (graphically) to detect whether σ(Xi) is

constant or not by plotting two quantile functions qτ1(Xi) and qτ2(Xi) for two distinct τ1 ≠ τ2.

If two quantile curves (lines) are parallel, this means that σ(Xi) is a constant. Further, if

σ(Xi) is a linear function of Xi, then the quantile regression model can be used to model the

conditional heteroscedasticity such as the autoregressive conditional heteroscedastic (ARCH)

and generalized autoregressive conditional heteroscedastic (GARCH) as well as stochastic

volatility (SV) models. For details on this aspect, see the papers by Koenker and Zhao (1996)

and Xiao (2006). Of course, one can model some nonlinear type models such as ARCH

or GARCH or SV models. For more details, see the book by Koenker (2005). Finally,

another application of conditional quantiles is the construction of prediction intervals

for the next value given a small section of recent past values in a stationary time series

(Granger, White, and Kamstra, 1989; Koenker, 1994; Zhou and Portnoy, 1996; Koenker

and Zhao, 1996; Taylor and Bunn, 1999). Also, Granger, White, and Kamstra (1989),

Koenker and Zhao (1996), and Taylor and Bunn (1999) considered an interval forecasting

for parametric autoregressive conditional heteroscedastic type models. For example, the

quantile regression can be used to construct a prediction interval (q0.025(Xt), q0.975(Xt)) such that P(q0.025(Xt) ≤ Yt+1 ≤ q0.975(Xt) | Xt) = 0.95.
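A sketch of such an interval via a first-order quantile autoregression, using the command rq() from the package quantreg described next; here y denotes a hypothetical stationary series:

library(quantreg)
n=length(y)
dat=data.frame(ynext=y[2:n],ylag=y[1:(n-1)])
fit_lo=rq(ynext~ylag,tau=0.025,data=dat)     # lower conditional quantile
fit_hi=rq(ynext~ylag,tau=0.975,data=dat)     # upper conditional quantile
new=data.frame(ylag=y[n])                    # most recent observation
c(predict(fit_lo,newdata=new),predict(fit_hi,newdata=new))
# approximate 95% prediction interval for y_{n+1}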

To fit a linear quantile regression using R, one can use the command rq() in the package

quantreg. For a nonlinear parametric model, the command is nlrq(). For a nonparametric

quantile model for univariate case, one can use the command lprq() for implementing the

local polynomial estimation. For an additive quantile regression, one can use the commands

rqss() and qss().

rq(formula, tau=.5, data, subset, weights, na.action,

method="br", model = TRUE, contrasts, ...)

lprq(x, y, h, tau = .5, m = 50)
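A usage sketch on the engel data shipped with quantreg (food expenditure versus household income): the median (LAD) line plus two outer quantile lines; roughly parallel lines would indicate a constant σ(X):

library(quantreg)
data(engel)
plot(foodexp~income,data=engel,cex=0.5)
for(tau in c(0.05,0.5,0.95)){
   fit=rq(foodexp~income,tau=tau,data=engel)
   abline(coef=coef(fit),lty=ifelse(tau==0.5,1,2))
}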

6.3 Computer Codes

The following is the R code for making Figure 6.1.

# define functions


rho1=function(x)x^2

rho2=function(x,c)x^2*(abs(x)<=c)+(c*abs(x)-c^2/2)*(abs(x)>c)

rho3=function(x,c)x^2*(abs(x)<=c)+c^2*(abs(x)>c)

rho_tau=function(x,tau)x*(tau-(x<0))

z=seq(-2,2,by=0.1)

c=1.345

y=cbind(rho1(z),rho2(z,c),rho3(z,c),rho_tau(z,0.05),rho_tau(z,0.5),

rho_tau(z,0.95))

text1=c("quadratic","Huber","Huber0","tau=0.05","LAD","tau=0.95")

c1=c*rep(1,10)

c2=seq(0,c,length=10)

c3=c2^2

postscript(file="c:/res-teach/xiamen12-06/figs/fig-6.1.ps",

horizontal=F,width=6,height=6)

# output the graph as a ps or eps file

#win.graph() # show the graph on the screen

par(bg="light green")

matplot(z,y,type="l",lty=1:6,ylab="",xlab="",ylim=c(-0.2,4))

points(c1,c3,type="l")

points(-c1,c3,type="l")

title(main="Loss Functions",col.main="red",cex=0.5)

legend(0,4,text1,lty=1:6,col=1:6,cex=0.7)

text(-c,-0.2,"-c")

text(c,-0.2,"c")

abline(0,0)

dev.off()

6.4 References

Granger, C.W.J., H. White and M. Kamstra (1989). Interval forecasting: An analysis based upon ARCH-quantile estimators. Journal of Econometrics, 40, 87-96.

Huber, P.J. (1981). Robust Statistics. New York: Wiley.

Koenker, R. (1994). Confidence intervals for regression quantiles. In Proceedings of the Fifth Prague Symposium on Asymptotic Statistics (P. Mandl and M. Huskova, eds.), 349-359. Heidelberg: Physica.

Koenker, R. (2005). Quantile Regression. New York: Cambridge University Press.

Koenker, R. and G.W. Bassett (1978). Regression quantiles. Econometrica, 46, 33-50.

Koenker, R. and G.W. Bassett (1982). Robust tests for heteroscedasticity based on regression quantiles. Econometrica, 50, 43-61.

Koenker, R. and Q. Zhao (1996). Conditional quantile estimation and inference for ARCH models. Econometric Theory, 12, 793-813.

Serfling, R.J. (1980). Approximation Theorems of Mathematical Statistics. New York: John Wiley.

Taylor, J.W. and D.W. Bunn (1999). A quantile regression approach to generating prediction intervals. Management Science, 45, 225-237.

Xiao, Z. (2006). Conditional quantile estimation for GARCH models. Working Paper, Department of Economics, Boston College.

Zhou, K.Q. and S.L. Portnoy (1996). Direct use of regression quantiles to construct confidence sets in linear models. The Annals of Statistics, 24, 287-306.


Chapter 7

How to Analyze Boston House Price Data?

7.1 Description of Data

The well-known Boston house price data set1 consists of 14 variables, collected on each

of 506 different houses from a variety of locations. The Boston house-price data set was

used originally by Harrison and Rubinfeld (1978) and it was re-analyzed in Belsley, Kuh

and Welsch (1980) using various transformations in the table on pages 244-261. The variables, denoted by X1, · · · , X13 and Y, are, in order:

CRIM per capita crime rate by town

ZN proportion of residential land zoned for lots over 25,000 sq.ft.

INDUS proportion of non-retail business acres per town

CHAS Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)

NOX nitric oxides concentration (parts per 10 million)

RM average number of rooms per dwelling

AGE proportion of owner-occupied units built prior to 1940

DIS weighted distances to five Boston employment centers

RAD index of accessibility to radial highways

TAX full-value property-tax rate per 10,000USD

PTRATIO pupil-teacher ratio by town

B 1000(Bk - 0.63)^2 where Bk is the proportion of blacks by town

1This dataset can be downloaded from the web site at http://lib.stat.cmu.edu/datasets/boston.


LSTAT lower status of the population

MEDV Median value of owner-occupied homes in $1000’s

The dependent variable is Y, the median value of owner-occupied homes in $1,000's

(house price). The major factors possibly affecting the house prices used in the literature

are:

X13=proportion of population of lower educational status,

X6=the average number of rooms per house,

X1=the per capita crime rate,

X10=the full property tax rate,

and

X11=the pupil/teacher ratio.

For the complete description of all 14 variables, see Harrison and Rubinfeld (1978) and Gilley

and Pace (1996) for corrections.

7.2 Analysis Methods

7.2.1 Linear Models

Harrison and Rubinfeld (1978) were the first to analyze this data set, using a standard regression model of Y versus all 13 variables, including some higher-order terms or transformations of Y and the Xj's. The purpose of their study was to assess the effect of pollution on housing prices via the hedonic pricing methodology. Belsley, Kuh and Welsch (1980) used this

data set to illustrate the effects of using robust regression and outlier detection strategies.

From these results, we might conclude that the model might not be linear and there might

exist outliers. Also, Pace and Gilley (1997) added a georeferencing idea (spatial statistics or

econometrics) and used a spatial estimation method to analyze this data set.

Exercise: Please use all possible methods to explore this dataset to see what is the best

linear model (including higher orders) you can obtain.


7.2.2 Nonparametric Models

Mean Additive Models

There have been several papers devoted to the analysis of this dataset using some non-

parametric methods. For example, Breiman and Friedman (1985), Pace (1993), Chaudhuri,

Doksum and Samarov (1997), and Opsomer and Ruppert (1998) used four covariates: X6,

X10, X11 and X13 or their transformations (including the transformation on Y ) to fit the

data through a mean additive regression model such as

log(Y) = µ + g1(X6) + g2(X10) + g3(X11) + g4(X13) + ε,     (7.1)

where the additive components gj(·) are unspecified smooth functions. Pace (1993) and

Chaudhuri, Doksum and Samarov (1997) also considered the nonparametric estimation of the

first derivative of each additive component which measures how much the response changes as

one covariate is perturbed while the other covariates are held fixed; see Chaudhuri, Doksum

and Samarov (1997). Let us use model (7.1) to fit the Boston house price data. The results

are summarized in Figure 7.1 (the R code can be found in Section 7.3).

To fit an additive model or a partially additive model in R, the function is gam() in the

package gam. For details, please look at the help command help(gam) after loading the

package gam [library(gam)].

gam(formula, family = gaussian, data, weights, subset, na.action,

start, etastart, mustart, control = gam.control(...),

model=FALSE, method, x=FALSE, y=TRUE, ...)

Note that the function gam() also allows one to fit a semiparametric additive model such as

log(Y) = µ + g1(X6) + β2 X10 + β3 X11 + β4 X13 + ε,     (7.2)

which is done by specifying some components without smoothing.

Fan and Jiang (2005) used the generalized likelihood ratio (GLR) test [see Cai, Fan and Li (2000), Cai, Fan and Yao (2000), and Fan, Zhang and Zhang (2001)] to test whether any component is linear or insignificant. In other words, they tested an additive model against a partially additive model or a smaller additive model. That is, H0 : gj(x) = a + bx or

H0 : gj(x) = 0. Let us use model (7.2) to fit the Boston house price data. The results are

summarized in Figure 7.2 (the R code can be found in Section 7.3).



Figure 7.1: The results from model (7.1).

Quantile Regression Approaches

Some existing analyses (e.g., Breiman and Friedman, 1985) in mean regression models con-

cluded that most of the variation seen in housing prices in the restricted data set can be

explained by two major variables: X6 and X13. Indeed, the correlation coefficients between

Y and X13 and X6 are −0.7377 and 0.6954, respectively. Figure 7.3 gives the scatter plots

of the house price versus the covariates X13, X6, X1 and X∗1, respectively. The interesting

features of this dataset are that the response variable is the median price of a home in a

given area and the distributions of Y and the major covariate X13 are left skewed (please

look at the density estimates; see Figure 7.2(d) for the density estimate of Y, which can be produced using the command density() in R).

To overcome the skewness, one way is to use a transformation such as the Box-Cox transformation, as mentioned above. Another way (a better way) is to use quantile regression.

Therefore, Yu and Lu (2004) employed the additive quantile technique to analyze the



Figure 7.2: (a) Residual plot for model (7.1). (b) Plot of g1(x6) versus x6. (c) Residual plot for model (7.2). (d) Density estimate of Y.

data to fit the following additive quantile model, for any 0 < τ < 1,

qτ(X) = δτ + θτ,1(X6) + θτ,2(X∗10) + θτ,3(X11) + θτ,4(X∗13),     (7.3)

where X∗10 = log(X10), X∗13 = log(X13), and the additive components θτ,j(·) are unspecified

smooth functions. Unfortunately, the detailed graphs are not provided and you are asked to

write a code for model (7.3) to present the computational results, including graphs. Note that

you can use the command rqss() in R to compute the estimated values of each component

in model (7.3).
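A minimal sketch of such a fit with rqss() (a starting point, not a full solution of the exercise), using the variable names created in Section 7.3, the median τ = 0.5, and the default smoothing parameters:

library(quantreg)
x10l=log(x10)                 # X*_10
x13l=log(x13)                 # X*_13
fit_aq=rqss(y~qss(x6)+qss(x10l)+qss(x11)+qss(x13l),tau=0.5)
plot(fit_aq)                  # one panel per estimated additive component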

Recently, Senturk and Muller (2005a, 2005b, 2006) studied the correlation between the

house price Y and the crime rate X1 adjusted by the confounding variable X13 through a

varying coefficient model and they concluded that the expected effect of increasing crime

rate on declining house prices seems to be only observed for lower educational status neigh-

borhoods in Boston. Finally, it is surprising that all the existing nonparametric models



Figure 7.3: Boston Housing Price Data: Displayed in (a)-(d) are the scatter plots of the house price versus the covariates X13, X6, X1 and X∗1, respectively.

mentioned above did not include the crime rate X1, which may be an important factor affecting the housing price, and did not consider interaction terms such as those between X13 and X1.

Based on the above discussion, Cai and Xu (2005) studied the following nonparametric quantile regression model, which might be well suited to the analysis of this dataset:

qτ(X) = a0,τ(X13) + a1,τ(X13) X6 + a2,τ(X13) X∗1,   1 ≤ t ≤ n = 506,     (7.4)

where X∗1 = log(X1). The reason for using the logarithm of X1 in (7.4), instead of X1 itself,

is that the correlation between Yt and X∗1 is slightly stronger than that for Yt and X1. The

estimated curves for three coefficient functions are displayed in Figure 7.4. In the model

fitting, the covariates X6 and X∗1 can be centered. Of course, you can add more covariates

into model (7.4) and then you need to consider the model diagnostics to see whether some

covariates are significant. Also, you might apply some semiparametric models to this data

set. Unfortunately, the detailed code is not provided and you are asked to write a code by



Figure 7.4: Boston Housing Price Data: The plots of the estimated coefficient functions for model (7.4) at three quantiles, τ = 0.05 (solid line), τ = 0.50 (dashed line), and τ = 0.95 (dotted line), together with the mean regression (dot-dashed line): a0,τ(u) and a0(u) versus u in (a), a1,τ(u) and a1(u) versus u in (b), and a2,τ(u) and a2(u) versus u in (c). The thick dashed lines indicate the 95% pointwise confidence interval for the median estimate with the bias ignored.

yourself.

7.3 Computer Codes

The following is the R code for making Figures 7.1 and 7.2.

data=read.table("c:/res-teach/xiamen12-06/data/ex7-1.txt")

y=data[,14]

x1=data[,1]

x6=data[,6]

x10=data[,10]

x11=data[,11]

x13=data[,13]


y_log=log(y)

library(gam)

fit_gam=gam(y_log~lo(x6)+lo(x10)+lo(x11)+lo(x13))

resid=fit_gam$residuals

y_hat=fit_gam$fitted

postscript(file="c:/res-teach/xiamen12-06/figs/fig-7.1.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(2,2),mex=0.4)

plot(fit_gam)

title(main="Component of X_13",col.main="red",cex=0.6)

dev.off()

fit_gam1=gam(y_log~lo(x6)+x10+x11+x13)

s1=fit_gam1$smooth[,1] # obtain the smoothed component

resid1=fit_gam1$residuals

y_hat1=fit_gam1$fitted

print(summary(fit_gam1))

postscript(file="c:/res-teach/xiamen12-06/figs/figs/fig-7.2.eps",

horizontal=F,width=6,height=6)

par(mfrow=c(2,2),mex=0.4)

plot(y_hat,resid,type="p",pch="o",ylab="",xlab="y_hat")

title(main="Residual Plot of Additive Model",col.main="red",cex=0.6)

abline(0,0)

plot(x6,s1,type="p",pch="o",ylab="s1(x6)",xlab="x6")

title(main="Component of X_6",col.main="red",cex=0.6)

plot(y_hat1,resid1,type="p",pch="o",ylab="",xlab="y_hat")

title(main="Residual Plot of Model II",col.main="red",cex=0.5)

abline(0,0)

plot(density(y),ylab="",xlab="",main="Density of Y")

dev.off()


7.4 References

Belsley, D.A., E. Kuh and R.E. Welsch (1980). Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. New York: Wiley.

Breiman, L. and J.H. Friedman (1985). Estimating optimal transformations for multiple regression and correlation. Journal of the American Statistical Association, 80, 580-619.

Cai, Z., J. Fan and R. Li (2000). Efficient estimation and inferences for varying-coefficient models. Journal of the American Statistical Association, 95, 888-902.

Cai, Z., J. Fan and Q. Yao (2000). Functional-coefficient regression models for nonlinear time series. Journal of the American Statistical Association, 95, 941-956.

Cai, Z. and X. Xu (2005). Nonparametric quantile estimations for dynamic smooth coefficient models. Revised for Journal of the American Statistical Association.

Chaudhuri, P., K. Doksum and A. Samarov (1997). On average derivative quantile regression. The Annals of Statistics, 25, 715-744.

Fan, J. and J. Jiang (2005). Nonparametric inference for additive models. Journal of the American Statistical Association, 100, 890-907.

Fan, J., C. Zhang and J. Zhang (2001). Generalized likelihood ratio statistics and Wilks phenomenon. The Annals of Statistics, 29, 153-193.

Gilley, O.W. and R.K. Pace (1996). On the Harrison and Rubinfeld data. Journal of Environmental Economics and Management, 31, 403-405.

Harrison, D. and D.L. Rubinfeld (1978). Hedonic housing prices and the demand for clean air. Journal of Environmental Economics and Management, 5, 81-102.

Opsomer, J.D. and D. Ruppert (1998). A fully automated bandwidth selection for additive regression models. Journal of the American Statistical Association, 93, 605-618.

Pace, R.K. (1993). Nonparametric methods with applications to hedonic models. Journal of Real Estate Finance and Economics, 7, 185-204.

Pace, R.K. and O.W. Gilley (1997). Using the spatial configuration of the data to improve estimation. Journal of the Real Estate Finance and Economics, 14, 333-340.

Senturk, D. and H.G. Muller (2005a). Covariate-adjusted regression. Biometrika, 92, 75-89.

Senturk, D. and H.G. Muller (2005b). Covariate adjusted correlation analysis. Scandinavian Journal of Statistics, 32, 365-383.

Senturk, D. and H.G. Muller (2006). Inference for covariate adjusted regression via varying coefficient models. The Annals of Statistics, 34, 654-679.


Yu, K. and Z. Lu (2004). Local linear additive quantile regression. Scandinavian Journal of Statistics, 31, 333-346.


Chapter 8

Value at Risk

8.1 Methodology

Value at Risk (VaR) is one of the basic measures of the risk of a loss in a particular portfolio. It is typically stated in a form like "there is a one percent chance that, over a particular day of trading, the position could lose some dollar amount." We can state it more generally as VaR(p, N) = V, where V is the dollar amount of the possible loss, p is the percentage (managers are commonly interested in 1%, 5%, or 10%), and N is the time period (one day, one week, one month). To get from the one-day VaR to the N-day VaR, if the risks are independent and identically distributed, we multiply by √N (please verify this; do you need any assumption?).
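As a quick numerical check (not part of the original notes), the following R sketch verifies the √N rule by simulation, assuming iid normal daily returns with zero mean; note that a nonzero drift would break the exact √N scaling.

# Check that the N-day 1% quantile is about sqrt(N) times the 1-day one,
# under an assumed iid N(0, 0.01^2) model for daily returns.
set.seed(1)
N=10
ret1=rnorm(100000,0,0.01)                     # simulated 1-day returns
retN=colSums(matrix(rnorm(N*100000,0,0.01),nrow=N)) # simulated N-day returns
q1=quantile(ret1,0.01)                        # 1-day 1% quantile
qN=quantile(retN,0.01)                        # N-day 1% quantile
print(qN/q1)                                  # close to sqrt(10)=3.162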

Or, in one irreverent definition, VaR is "a number invented by purveyors of panaceas for pecuniary peril intended to mislead senior management and regulators into false confidence that market risk is adequately understood and controlled". VaR is called upon for a variety of tasks (enumerated in Duffie and Pan 1997): measure risk exposure; ensure correct capital allocation; provide information to counterparties, regulators, auditors, and other stakeholders; evaluate and provide incentives to profit centers within the firm; and protect against financial distress. Note that the final desired outcome (protection against loss) is not the only desired outcome. Being too safe costs money and loses business!

VaR is an important component of bank regulation: the Basle Accord sets capital based on the 10-day 1% VaR (so, if risks are iid, the 10-day VaR is about √10 ≈ 3.16 times the one-day VaR). The capital is set at least 3 times as high as the 10-day 1% VaR, and can be as much as four times higher if the bank's own VaR calculations have performed poorly in the past year. If a bank hits its own 1% VaR more than 3 times in a year (of 250 trading days), then its capital adequacy rating is increased for the next year. Poor modeling can carry a high cost!

If we graph the distribution of possible returns of a portfolio, we can interpret VaR as a percentile. If we plot the cumulative distribution function (cdf) of the portfolio value, then the VaR is simply the inverse cdf evaluated at the probability:

VaR = F^{-1}(p).

Or, if we graph the probability density function (pdf), then VaR is the value that puts a particular area under the curve to its left. This is obviously similar to hypothesis testing in statistics.

Typically we are concerned with the lower tail of losses; therefore some people might refer to either the 1% VaR or the 99% VaR. From the mathematical description above, the 99% VaR would seem to be the upper tail probability (and therefore the loss on a short position), but this is rarely the intent; the formulation typically refers to the idea that 99% of losses will be smaller (in absolute value) than V.

There are three basic methods of computing the Value at Risk of a portfolio:

(a) parametric models that assume some error distribution,

(b) historical returns that use a selection of past data, and

(c) computationally intensive Monte Carlo simulations that draw thousands of possible future paths.

Each of these typically has two steps: first determine a possible path for the value of some underlying asset, then determine the change in the value of the portfolio assets along that path.

The parametric model case assumes that future returns can be modeled with some statistical distribution such as the normal. A statistician would note that P((X − µ)/σ < −Z) = p and, in the case of the normal distribution, for p = 0.01 (a 1% one-tailed critical value), −Z = −2.326. Distributions other than the normal would give different critical values. Sometimes a modeler would use the t-distribution (with some degrees of freedom) or even the Cauchy (a t-distribution with 1 df) to give heavier tails, a generalized Pareto distribution (see later), or some more sophisticated and complex distribution (for example, hyperbolic distributions; see Eberlein and Keller (1995)).

Regardless of what distribution is chosen, the above probability can be rewritten as P(X < µ − σZ) = p, so that the quantity µ − σZ is exactly V, the value at risk.
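To make the formula concrete, here is a minimal R sketch of the parametric (normal) VaR; the mean and standard deviation are illustrative assumptions, not numbers taken from the text.

p=0.01
mu=0.00018       # hypothetical mean daily log return
sigma=0.013      # hypothetical standard deviation of daily log returns
Z=-qnorm(p)      # 2.326 for p=0.01
V=mu-sigma*Z     # the p-quantile of the return distribution
print(V)         # about -0.03, i.e., a 3% one-day loss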

An extension of the parametric approach is to use the Greeks that we have derived from

the Black-Scholes-Merton model (under the assumption of normally-distributed errors) to

calculate possible changes in the portfolio values.

We can also incorporate information from options prices into the calculation of forward value at risk. For instance, if S&P 500 option prices imply a high VIX (volatility), then the model could use this volatility instead of the historical average. In extracting information from derivatives prices we must be careful of the difference between the risk-neutral distribution and the distribution of actual price changes: VaR is based on actual changes, not on some perfectly-hedged portfolio!

The basic method of historical returns would just select some reference period of history (say, the past 2 years), order the returns from that period, and select the 100p% worst one to represent the value at risk based on past trading history. Basically, instead of using some parametric cdf, this method uses the historical cdf (an empirical cdf). The choice of historical period means that this method is not quite as simple as it might sound. The parametric method has a potential misspecification if the true distribution is not the specified one. However, the historical cdf (the empirical cdf) is always a consistent estimate of the true cdf. Therefore, the historical method may be preferable, although it is statistically more involved.
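A minimal R sketch of the historical method, assuming a vector ret of past daily returns (simulated here purely as a placeholder for real data):

set.seed(2)
ret=rnorm(500,0,0.013)      # placeholder for roughly two years of real returns
VaR_hist=quantile(ret,0.01) # the 1% worst daily return in the sample
print(VaR_hist)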

Monte Carlo simulations use computational power to compute a cumulative distribution function. Possible future paths of asset prices and derivatives are computed, and the value of the portfolio is calculated along each one. If the errors are drawn from a parametric distribution, then the Monte Carlo method will, in the limit, give the same result as the parametric method. If the errors are drawn from a historical distribution, then the Monte Carlo method will give the same result as the historical method. Typically a Monte Carlo simulation will use some of each.
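The following R sketch illustrates the Monte Carlo method for a trivial one-asset portfolio; the return model is an assumption chosen only for illustration.

set.seed(3)
S0=100                      # hypothetical current asset price
r=rnorm(100000,0,0.015)     # simulated 1-day log returns (assumed model)
pnl=S0*exp(r)-S0            # profit and loss of a one-share position
print(quantile(pnl,0.01))   # 1-day 1% Monte Carlo VaR (in dollars)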

Each method has strengths and weaknesses. The parametric method is quick and easy to implement with a minimum of computing power. It is good for calculating day-to-day trading swings for thickly-traded assets and conventional derivatives (plain vanilla or with well-specified Greeks). But most common probability distributions were developed for making inferences about behavior in the center of the probability mass (means and medians). The tails tend to be poorly specified and so do not perform as well.

Each method must choose which underlying variables, and which covariances, are considered. The covariances are tough (particularly for historical simulations) since the "curse of dimensionality" quickly takes hold. For example, consider a histogram of 250 observations (one trading year) of returns on a single variable, and compare it with the distribution of the same 250 observations but now incorporating the joint behavior of two variables: you might see that many of the boxes now represent a single observation (they are just 1 unit high), while the possible maximum value is 8 (down from 40 previously). A series of "historical simulations" drawing from these observations would have very few options, compared with a parametric draw that needs only the correlation between the variables and a specification of a distributional form. The imposition of a normal distribution might be unrealistic, but it might be less bad than the alternatives.

To handle the dimensionality problem, we might use principal components analysis, which breaks down complex movements of variables (like the yield curve) into simpler orthogonal "factor" movements that can then be used in place of the real changes.

To consider the VaR constructions in more detail, consider how to define the VaR for a portfolio with just a single asset. If this asset has a normal distribution, then a parametric VaR would say there is a 1% chance that a day's trading will end more than 2.326 standard deviations below average (see the example above for details on calculating V from a Z statistic). We would use information on the past performance of the asset to determine its average and standard deviation. A historical simulation VaR would rank, say, the past 4 years of data (approximately 1000 observations) to find the worst 10.

For the last 4 years of data on the OEX (S&P 100) (please download this dataset and analyze it by yourself), the mean daily return (expressed in logs) is 0.018% and its standard deviation is 0.013. Using the 1% Z value from above, −2.326, we find a parametric 1-day 1% VaR of −3.06%. If we instead use the historical returns, we find a 1-day 1% VaR of −3.49%. How different are these numbers? In that same time period, the parametric VaR was exceeded on 1.3% of days; the historical returns VaR would be hit only 0.4% of the time if the parametric model were correct. We should add a number of caveats, principally to note that the historical period we are considering includes September 2001, when trading stopped for a week. When trading resumed, the S&P 100 had fallen over 5%, but this underestimates the financial impact: other world markets were open, so any positions that crossed international borders would have lost more. We might choose not to include that episode, in which case the VaR numbers change to a parametric 3.03% or a historical 3.28%.

There is a general tradeoff between the amount of historical data used and the degree of theory imposed. No model is completely theory-independent (the historical simulation model assumes that the future will be sampled from some defined past). As the amount of theory increases, more inferences can be drawn, but at a cost: if the theory is incorrect, then the inferences will be worse (although if the theory is right, then the inferences will be sharper).

All these methods share the weakness of looking to the past for guidance, rather than seriously considering future events that are genuinely new. Stress testing tries to push these methods further by asking how a portfolio position would react to some of the most extreme price/volume changes of past decades.

Back testing checks that the percentages line up. Not only must the number of actual "hits" be close to the predicted number, but the hits should also be independent of each other. Hits that are clustered indicate that the VaR is not adequately dependent upon current market valuations. Back testing can also explore the adequacy of other percentile measures of VaR, on the insight that if the 5% VaR (or 10%, or 0.5%, or whatever) is wrong, then we might have more skepticism about the validity of the 1% measure. For more details, see Chapter 6 of Crouhy, Galai, and Mark (2001).
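A minimal back-testing sketch in R, reusing ret and VaR_hist from the sketch above as the realized returns and the (negative) VaR forecast:

hits=ret<VaR_hist        # TRUE on days the loss exceeded the VaR forecast
print(mean(hits))        # should be close to the nominal 1%
print(diff(which(hits))) # gaps between hits; many small gaps suggest clustering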

All of these methods also find only a single percentile of the distribution. Suppose a firm has some exotic option which will be worth $100 on 99.5% of trading days but will lose $1000 on the other 0.5%. Then the 1% VaR is $100. Even if the option lost $10,000 or $1,000,000 on 0.5% of trading days, the calculated VaR would not change! In the pictures of the cdf and pdf, this simply means that the left-hand tail can be drawn out to stretch into huge losses. Therefore some advocate the calculation of a Conditional VaR, which is the expected loss conditional on being among the worst 1% of trading days.
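In R, the conditional VaR (expected shortfall) is just the average of the returns beyond the VaR; a minimal sketch, again reusing the historical returns ret from above:

p=0.01
v=quantile(ret,p)       # the ordinary p-quantile (historical VaR)
ES=mean(ret[ret<=v])    # average return on the worst 100*p% of days
print(c(v,ES))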

The VaR methods all basically assume that the trading desk is a machine that can generate any desired basket of risk/return on the efficient frontier; the role of management is to specify how much risk they want to take on (and therefore how high a return they want). This assumes that risk is a univariate measure with a defined specification, but there are many sources of risk. If management is genuinely so disconnected from the trading desk that is taking on this risk, then there is a real problem that no VaR can solve! If management is more closely involved, then they should understand the panoply of risks.

VaRs are a tool of risk management. Emanuel Derman writes that risk managers “need to

love markets and know what to worry about: bad marks, misspecified contracts, uncertainty,

risk, value-at-risk, illiquid positions, exotic options” (in Risk, May 2004). VaR is one part

of a long list.

8.2 R Commands (R Menu)

The package VaR contains mainly two functions: VaR.gpd() for a fit based on the Generalized Pareto Distribution (GPD), and VaR.norm() for a log-normal distribution. The function VaR.gpd() estimates the Value at Risk and the Expected Shortfall (ES) of a single risk factor at a given confidence level by fitting a GPD to the part of the data exceeding a given threshold (the Peaks Over Threshold (POT) method). The input data are transformed to percentage daily returns. The transformed data are then sorted, and only the part exceeding a given threshold is kept; the threshold is calculated according to the expression p.tr*std. A maximum likelihood fit is then applied to obtain values of VaR and ES, after which confidence intervals for these values are calculated; see Embrechts, Kluppelberg and Mikosch (1997) and Tsay (2005) for details. The R manual for the VaR package is VaR.pdf, which can be downloaded, for example, from the web site at http://cran.cnr.berkeley.edu/doc/packages/VaR.pdf.
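Rather than reproducing the exact interface of the VaR package (see VaR.pdf for that), the following base R sketch implements the POT idea just described: fit a GPD to the exceedances over a threshold by maximum likelihood, then form VaR and ES from the standard EVT tail formulas (see, e.g., Embrechts, Kluppelberg and Mikosch, 1997). The function and argument names here are ours, the ES formula requires the shape xi < 1, and p must be smaller than the fraction of the data above the threshold.

pot_var=function(loss,u,p=0.01){       # loss = negative returns
  exc=loss[loss>u]-u                   # exceedances over the threshold u
  nll=function(par){                   # negative log-likelihood of the GPD
    xi=par[1]; beta=par[2]
    if(beta<=0||any(1+xi*exc/beta<=0)) return(Inf)
    sum(log(beta)+(1+1/xi)*log(1+xi*exc/beta))
  }
  fit=optim(c(0.1,sd(exc)),nll)
  xi=fit$par[1]; beta=fit$par[2]
  Fu=length(exc)/length(loss)          # empirical tail weight above u
  VaR=u+beta/xi*((p/Fu)^(-xi)-1)       # tail quantile estimate
  ES=(VaR+beta-xi*u)/(1-xi)            # expected shortfall (valid for xi<1)
  list(xi=xi,beta=beta,VaR=VaR,ES=ES)
}
# e.g., pot_var(loss,u=quantile(loss,0.95),p=0.01) with loss=-returns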

8.2.1 Generalized Pareto Distribution

The Pareto distribution is a skewed, heavy-tailed distribution that is sometimes used to model the distribution of incomes. Let F(x) = 1 − x^{−1/k} for x > 1, where k > 0 is a parameter. The distribution defined by this function is called the Pareto distribution with shape parameter k, and is named for the economist Vilfredo Pareto. The density function f(·) is given by f(x) = k^{−1} x^{−(1+1/k)} for x > 1. Clearly, f(x) is decreasing for x > 1, and f(·) decreases faster as k decreases. The quantile function is F^{−1}(p) = (1 − p)^{−k} for 0 < p < 1. The Pareto distribution is a heavy-tailed distribution; thus, the mean, variance, and other moments are finite only if the shape parameter k satisfies certain conditions.

As with many other distributions, the Pareto distribution is often generalized by adding a scale parameter. Thus, suppose that Z has the Pareto distribution with shape parameter k. If σ > 0, the random variable X = σZ has the Pareto distribution with shape parameter k and scale parameter σ. Note that X takes values in the interval (σ, ∞). Analogues of the results given above follow easily from basic properties of the scale transformation. The density function is f(x) = k^{−1} σ^{1/k} x^{−(1+1/k)} for x > σ, the distribution function is F(x) = 1 − (x/σ)^{−1/k} for x > σ, and the quantile function is F^{−1}(p) = σ(1 − p)^{−k} for 0 < p < 1.

A more general form of the generalized Pareto distribution, with shape parameter k ≠ 0, scale parameter σ, and threshold parameter θ, has density

f(x) = (1/σ) (1 + k (x − θ)/σ)^{−1−1/k}

for θ < x when k > 0, or for θ < x < θ − σ/k when k < 0. In the limit as k → 0, the density is

f(x) = (1/σ) exp(−(x − θ)/σ)

for θ < x. If k = 0 and θ = 0, the generalized Pareto distribution is equivalent to the exponential distribution. If k > 0 and θ = σ, the generalized Pareto distribution is equivalent to the Pareto distribution.
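As a check on the formulas above, here is a direct R transcription of the three-parameter generalized Pareto density, including the exponential limit k = 0 (the function name is ours):

dgpd=function(x,k,sigma,theta){
  z=(x-theta)/sigma
  if(k==0) return(ifelse(z>0,exp(-z)/sigma,0))
  ifelse(z>0&1+k*z>0,(1+k*z)^(-1-1/k)/sigma,0)  # support rules as in the text
}
# sanity check: the density integrates to one
print(integrate(dgpd,0,Inf,k=0.3,sigma=1,theta=0))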

8.2.2 Background of the Generalized Pareto Distribution

Like the exponential distribution, the generalized Pareto distribution is often used to model the tails of another distribution. For example, you might have washers from a manufacturing process. If random influences in the process lead to differences in the sizes of the washers, a standard probability distribution, such as the normal, could be used to model those sizes. However, while the normal distribution might be a good model near its mode, it might not be a good fit to real data in the tails, and a more complex model might be needed to describe the full range of the data. On the other hand, recording only the sizes of washers larger (or smaller) than a certain threshold means you can fit a separate model to those tail data, which are known as exceedances. You can use the generalized Pareto distribution in this way, to provide a good fit to extremes of complicated data.

The generalized Pareto distribution allows a continuous range of possible shapes that includes both the exponential and Pareto distributions as special cases. You could use either of those distributions to model a particular dataset of exceedances; the generalized Pareto distribution instead allows you to "let the data decide" which distribution is appropriate. The generalized Pareto distribution has three basic forms, each corresponding to a limiting distribution of exceedance data from a different class of underlying distributions.

(1) Distributions whose tails decrease exponentially, such as the normal, lead to a

generalized Pareto shape parameter of zero.

(2) Distributions whose tails decrease as a polynomial, such as Student’s t, lead to

a positive shape parameter.

(3) Distributions whose tails are finite, such as the beta, lead to a negative shape

parameter.

8.3 Reading Materials I and II (see Handouts)

Reading Materials I and II are taken from Chapters 5 and 6 of the book by Crouhy, Galai, and Mark (2001), covering the basic concepts and methodologies of value at risk.

8.4 New Developments (Nonparametric Approaches)

For details, see the paper by Cai and Wang (2006). If you would like to read the whole paper, you can download it from the web site at http://www.wise.xmu.edu.cn/ under the WORKING PAPER column. Below we present only a part of Cai and Wang (2006).

8.4.1 Introduction

The value-at-risk (hereafter, VaR) and the expected shortfall (ES) have become two popular measures of the market risk associated with an asset or a portfolio of assets during the last decade. In particular, VaR has been chosen by the Basle Committee on Banking Supervision as the benchmark risk measure for capital requirements; both measures have been used by financial institutions for asset management and minimization of risk, and both have developed rapidly as analytic tools to assess the riskiness of trading activities. See, to name just a few, Morgan (1996), Duffie and Pan (1997), Jorion (2001, 2003), and Duffie and Singleton (2003) for the financial background, statistical inferences, and various applications. In terms of the formal definition, VaR is simply a quantile of the loss distribution (future portfolio values) over a prescribed holding period (e.g., 2 weeks) at a given confidence level, while ES is the expected loss given that the loss is at least as large as some given quantile of the loss distribution (e.g., the VaR). It is well known from Artzner, Delbaen, Eber and Heath (1999) that ES is a coherent risk measure, in that it satisfies the following four axioms:

that ES is a coherent risk measure such as it satisfies the following four axioms:

• homogeneity: increasing the size of a portfolio by a factor should scale

its risk measure by the same factor,

• monotonicity: a portfolio must have greater risk if it has systematically

lower values than another,

• risk-free condition or translation invariance: adding some amount of

cash to a portfolio should reduce its risk by the same amount, and

• subadditivity: the risk of a portfolio must be less than the sum of

separate risks or merging portfolios cannot increase risk.

VaR satisfies homogeneity, monotonicity, and the risk-free condition, but it is not subadditive. See Artzner, et al. (1999) for details. As advocated by Artzner, et al. (1999), ES is preferred due to its better properties, although VaR is widely used in applications.

Measures of risk might depend on the state of the economy, since economic and market conditions vary from time to time. This requires that risk managers focus on the conditional distributions of profit and loss, which take full account of current information about the investment environment (macroeconomic and financial as well as political) in forecasting future market values, volatilities, and correlations. As pointed out by Duffie and Singleton (2003), not only are the prices of the underlying market indices changing randomly over time, the portfolio itself is changing, as are the volatilities of prices, the credit qualities of counterparties, and so on. On the other hand, one would expect the VaR to increase as the past returns become very negative, because one bad day makes the probability of the next somewhat greater. Similarly, very good days also increase the VaR, as would be the case for volatility models. Therefore, VaR could depend on the past returns in some way. Hence, an appropriate risk analytical tool or methodology should be able to adapt to varying market conditions and to reflect the latest available information in a time series setting rather than in the iid framework. Most of the existing risk management literature has concentrated on unconditional distributions and the iid setting, although there have been some studies on conditional distributions and time series data. For more background, see Chernozhukov and Umanstev (2001), Cai (2002), Fan and Gu (2003), Engle and Manganelli (2004), Cai and Xu (2005), Scaillet (2005), and Cosma, Scaillet and von Sachs (2006), and references therein for conditional models, and Duffie and Pan (1997), Artzner, et al. (1999), Rockafellar and Uryasev (2000), Acerbi and Tasche (2002), Frey and McNeil (2002), Scaillet (2004), Chen and Tang (2005), Chen (2006), among others, for unconditional models. Also, most studies in the literature and in applications are limited to parametric models, such as all standard industry models like CreditRisk+, CreditMetrics, CreditPortfolio View and the model proposed by the KMV corporation. See Chernozhukov and Umanstev (2001), Frey and McNeil (2002), Engle and Manganelli (2004), and references therein on parametric models in practice, and Fan and Gu (2003) and references therein for semiparametric models.

The main focus of this paper is on studying the conditional value-at-risk (CVaR) and the conditional expected shortfall (CES) and on proposing a new nonparametric estimation procedure for the CVaR and CES functions, where the conditioning information is allowed to contain economic and market (exogenous) variables as well as past observed returns. Parametric models for CVaR and CES can be most efficient if the underlying functions are correctly specified. See Chernozhukov and Umanstev (2001) for a polynomial type regression model and Engle and Manganelli (2004) for a GARCH type parametric model for CVaR based on regression quantiles. However, a misspecification may cause serious bias, and model constraints may distort the underlying distributions. Nonparametric modeling is appealing in several respects. One advantage of nonparametric modeling is that little or no restrictive prior information on the functionals is needed. Further, it may provide useful insight for subsequent parametric fitting.

This paper makes several contributions to the literature. The first is to propose a new nonparametric approach to estimating CVaR and CES. In essence, our estimator for CVaR is based on inverting a newly proposed estimator of the conditional distribution function for time series data, and the estimator for CES is obtained by a plug-in method, plugging in the estimated conditional probability density function and the estimated CVaR function. Note that they are analogous to the estimators studied by Scaillet (2005), who used the Nadaraya-Watson (NW) type double kernel estimation (smoothing in both the y and x directions); by Cai (2002), who utilized the weighted Nadaraya-Watson (WNW) kernel type technique to avoid the so-called boundary effects; and by Yu and Jones (1998), who employed the double kernel local linear method. More precisely, our newly proposed estimator combines the WNW method of Cai (2002) and the double kernel local linear technique of Yu and Jones (1998), and is termed the weighted double kernel local linear (WDKLL) estimator.

The second contribution is to establish the asymptotic properties of the WDKLL estimators of the conditional probability density function (PDF) and cumulative distribution function (CDF) for α-mixing time series at both boundary and interior points. It is therefore shown that the WDKLL method enjoys the same convergence rates as the double kernel local linear estimator of Yu and Jones (1998) and the WNW estimator of Cai (2002). It is also shown that the WDKLL estimators have the desired sampling properties at both boundary and interior points of the support of the design density, which appears to be new in the literature. Finally, we derive the WDKLL estimator of CVaR by inverting the WDKLL conditional distribution estimator, and the WDKLL estimator of CES by plugging in the WDKLL estimators of the PDF and CVaR. We show that the WDKLL estimator of CVaR always exists, since the WDKLL estimator of the CDF is itself a distribution function, and that it inherits all the good properties of the WDKLL estimator of the CDF; that is, the WDKLL estimator of the CDF is a differentiable CDF, and it possesses asymptotic properties such as design adaptation, absence of boundary effects, and mathematical efficiency. Note that, to preserve shape constraints, Cosma, Scaillet and von Sachs (2006) recently used a wavelet method to estimate conditional probability density and cumulative distribution functions and then to estimate conditional quantiles.

Note that the CVaR defined here is essentially the conditional quantile, or the quantile regression of Koenker and Bassett (1978), based on the conditional distribution, rather than the CVaR defined in some of the risk management literature (see, e.g., Rockafellar and Uryasev, 2000; Jorion, 2001, 2003), which is what we call ES here. Also, note that the ES here is called TailVaR in Artzner, et al. (1999). Moreover, as aforementioned, CVaR can be regarded as a special case of quantile regression. See Cai and Xu (2005) for the state of the art of current research on nonparametric quantile regression, including CVaR. Further, note that both ES and CES have been known for decades in the actuarial sciences, and they are very popular in the insurance industry. Indeed, they have been used to assess risk on a portfolio of potential claims and to design reinsurance treaties. See the book by Embrechts, Kluppelberg, and Mikosch (1997) for an excellent review of this subject, and the papers by McNeil (1997), Hurlimann (2003), Scaillet (2005), and Chen (2006). Finally, ES or CES is also closely related to other applied fields, such as the mean residual life function in reliability and the biometric function in biostatistics. See Oakes and Dasu (1990) and Cai and Qian (2000) and references therein.

The rest of the paper is organized as follows. Section 8.4.2 provides a blueprint of the basic notation and concepts. In Section 8.4.3, we present the detailed motivations and formulations of the new nonparametric procedures for estimating the conditional PDF, CDF, VaR and ES. We establish the asymptotic properties of these nonparametric estimators at both boundary and interior points, with a comparison, in Section 4. Together with a convenient and efficient data-driven method for selecting the bandwidth based on the nonparametric Akaike information criterion (AIC), Monte Carlo simulation studies and empirical applications to several stock index and stock returns are presented in Section 5. Finally, the derivations of the theorems are given in Section 6 with some lemmas, and the Appendix contains the technical proofs of certain lemmas needed in the proofs of the theorems presented in Section 6.

8.4.2 Framework

Assume that the observed data {(X_t, Y_t); 1 ≤ t ≤ n}, X_t ∈ ℜ^d, are available and are observed from a stationary time series model. Here Y_t is the risk or loss variable, which can be the negative logarithm of return (log loss), and X_t is allowed to include both economic and market (exogenous) variables and lagged values of Y_t; it can also be a vector. For expositional purposes, however, we consider only the case where X_t is a scalar (d = 1). Note that the proposed methodologies and their theory for the univariate case (d = 1) continue to hold in multivariate situations (d > 1); the extension to d > 1 involves no fundamentally new ideas. Note also that models with large d are often not practically useful, due to the "curse of dimensionality".

We now turn to the nonparametric estimation of the conditional expected shortfall µ_p(x), which is defined as

µ_p(x) = E[Y_t | Y_t ≥ ν_p(x), X_t = x],

where ν_p(x) is the conditional value-at-risk, defined as the solution of

P(Y_t ≥ ν_p(x) | X_t = x) = S(ν_p(x) | x) = p,

or, equivalently, ν_p(x) = S^{−1}(p | x), where S(y | x) = 1 − F(y | x) is the conditional survival function of Y_t given X_t = x, and F(y | x) is the conditional cumulative distribution function. It is easy to see that

µ_p(x) = ∫_{ν_p(x)}^∞ y f(y | x) dy / p,

where f(y | x) is the conditional probability density function of Y_t given X_t = x. To estimate µ_p(x), one can use the plug-in method:

µ̂_p(x) = ∫_{ν̂_p(x)}^∞ y f̂(y | x) dy / p,   (8.1)

where ν̂_p(x) is a nonparametric estimator of ν_p(x) and f̂(y | x) is a nonparametric estimator of f(y | x). The bandwidths for ν̂_p(x) and f̂(y | x) need not be the same.

Note that Scaillet (2005) used the NW type double kernel method, due to Roussas (1969), to estimate f(y | x) first, with the estimator denoted by f̃(y | x); then estimated ν_p(x) by inverting the estimated conditional survival function, denoted by ν̃_p(x) = S̃^{−1}(p | x), where S̃(y | x) = ∫_y^∞ f̃(u | x) du; and finally estimated µ_p(x) by plugging f̃(y | x) and ν̃_p(x) into (8.1), with the result denoted by µ̃_p(x). But it is well documented (see, e.g., Fan and Gijbels, 1996) that NW kernel type procedures have serious drawbacks: the asymptotic bias involves the design density, so they cannot be adaptive, and boundary effects exist, so they require boundary modifications. In particular, boundary effects might cause a serious problem for estimating ν_p(x), since it is concerned only with the tail probability. The question is how to provide a better estimate of f(y | x) and ν_p(x), so that we have a good estimate of µ_p(x). We address this issue in the next section.

8.4.3 Nonparametric Estimating Procedures

We start with the nonparametric estimators of the conditional density function and its distribution function, and then turn to the nonparametric estimators of the conditional VaR and ES functions.


There are several methods available in the literature for estimating ν_p(x), f(y | x), and F(y | x), such as kernel and nearest-neighbor methods. To name just a few, see Lejeune and Sarda (1988), Troung (1989), Samanta (1989), and Chaudhuri (1991) for iid errors, Roussas (1969) and Roussas (1991) for Markovian processes, and Troung and Stone (1992) and Boente and Fraiman (1995) for mixing sequences. To attenuate the drawbacks of the kernel type estimators mentioned in Section 8.4.2, some new methods have recently been proposed for estimating conditional quantiles. The first, a more direct approach using the "check" function, such as the robustified local linear smoother, was provided by Fan, Hu, and Troung (1994) and further extended by Yu and Jones (1997, 1998) for iid data. A more general nonparametric setting was explored by Cai and Xu (2005) for time series data. This modeling idea was initiated by Koenker and Bassett (1978) for linear regression quantiles and by Fan, Hu, and Troung (1994) for nonparametric models. See Cai and Xu (2005) and references therein for more discussion of models and applications. An alternative procedure is first to estimate the conditional distribution function by using the double kernel local linear technique of Fan, Yao, and Tong (1996) and then to invert the conditional distribution estimator to produce an estimator of a conditional quantile or CVaR. Yu and Jones (1997, 1998) compared these two methods theoretically and empirically and suggested that the double kernel local linear method would be better.

Estimation of Conditional PDF and CDF

To make a connection between the conditional density (distribution) function and the nonparametric regression problem, it is noted from standard kernel estimation theory (see, e.g., Fan and Gijbels, 1996) that for a given symmetric density function K(·),

E[K_{h_0}(y − Y_t) | X_t = x] = f(y | x) + (h_0^2/2) µ_2(K) f^{(2,0)}(y | x) + o(h_0^2) ≈ f(y | x), as h_0 → 0,   (8.2)

where K_{h_0}(u) = K(u/h_0)/h_0, µ_2(K) = ∫_{−∞}^∞ u^2 K(u) du, f^{(2,0)}(y | x) = ∂^2 f(y | x)/∂y^2, and ≈ denotes an approximation obtained by ignoring higher order terms. Note that Y*_t(y) = K_{h_0}(y − Y_t) can be regarded as an initial estimate of f(y | x), smoothed in the y direction. Note also that this approximation ignores the higher order terms O(h_0^j) for j ≥ 2, which are negligible if h_0 = o(h), where h is the bandwidth used for smoothing in the x direction (see (8.3) below). Therefore, the smoothing in the y direction is not critical in this context, and intuitively it should be under-smoothed. Thus, the left hand side of (8.2) can be regarded as a nonparametric regression of the observed variable Y*_t(y) on X_t, and the local linear (or polynomial) fitting scheme of Fan and Gijbels (1996) can be applied here.

This leads us to consider the following locally weighted least squares regression problem:

∑_{t=1}^n {Y*_t(y) − a − b(X_t − x)}^2 W_h(x − X_t),   (8.3)

where W(·) is a kernel function and h = h(n) > 0 is the bandwidth satisfying h → 0 and nh → ∞ as n → ∞, which controls the amount of smoothing used in the estimation. Note that (8.3) involves two kernels, K(·) and W(·); this is the reason for the name "double kernel". Minimizing the locally weighted least squares in (8.3) with respect to a and b, we obtain the locally weighted least squares estimator of f(y | x), denoted by f̂_ll(y | x), which is â. From Fan and Gijbels (1996) or Fan, Yao and Tong (1996), f̂_ll(y | x) can be re-expressed in linear estimator form as

f̂_ll(y | x) = ∑_{t=1}^n W_{ll,t}(x, h) Y*_t(y),

where, with S_{n,j}(x) = ∑_{t=1}^n W_h(x − X_t)(X_t − x)^j, the weights W_{ll,t}(x, h) are given by

W_{ll,t}(x, h) = [S_{n,2}(x) − (X_t − x) S_{n,1}(x)] W_h(x − X_t) / [S_{n,0}(x) S_{n,2}(x) − S_{n,1}^2(x)].

Clearly, the W_{ll,t}(x, h) satisfy the so-called discrete moment conditions: for 0 ≤ j ≤ 1,

∑_{t=1}^n W_{ll,t}(x, h) (X_t − x)^j = δ_{0,j} = 1 if j = 0 and 0 otherwise,   (8.4)

which follow from least squares theory; see (3.12) of Fan and Gijbels (1996, p. 63).

Note that the estimator f̂_ll(y | x) can range outside [0, ∞). The double kernel local linear estimator of F(y | x) is constructed (see (8) of Yu and Jones (1998)) by integrating f̂_ll(y | x):

F̂_ll(y | x) = ∫_{−∞}^y f̂_ll(u | x) du = ∑_{t=1}^n W_{ll,t}(x, h) G_{h_0}(y − Y_t),

where G(·) is the distribution function of K(·) and G_{h_0}(u) = G(u/h_0). Clearly, F̂_ll(y | x) is continuous and differentiable with respect to y, with F̂_ll(−∞ | x) = 0 and F̂_ll(∞ | x) = 1. The differentiability of the estimated distribution function makes the asymptotic analysis much easier for the nonparametric estimators of CVaR and CES (see later).
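For concreteness, here is a small R sketch of F̂_ll(y | x), ours rather than from Yu and Jones (1998), taking both K(·) and W(·) to be Gaussian so that G = pnorm; it evaluates the estimator at a single point (y, x).

Fll=function(y,x,X,Y,h,h0){
  w=dnorm((x-X)/h)/h              # W_h(x-X_t)
  S1=sum(w*(X-x)); S2=sum(w*(X-x)^2)
  Wt=w*(S2-(X-x)*S1)              # local linear weights, unnormalized
  Wt=Wt/sum(Wt)                   # now sum(Wt)=1 and sum(Wt*(X-x))=0, as in (8.4)
  sum(Wt*pnorm((y-Y)/h0))         # G_{h0}(y-Y_t) with G=pnorm
}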


Although Yu and Jones (1998) showed that the double kernel local linear estimator has some attractive properties, such as no boundary effects, design adaptation, and mathematical efficiency (see, e.g., Fan and Gijbels, 1996), it has the disadvantage of producing conditional distribution function estimators that are not constrained either to lie between zero and one or to be monotone increasing, which is undesirable when estimating CVaR by inversion. In both of these respects, the NW method is superior, despite its rather large bias and its boundary effects. The properties of positivity and monotonicity are particularly advantageous if the estimator of a conditional quantile or CVaR is produced by inverting the conditional distribution estimator. To overcome these difficulties, Hall, Wolff, and Yao (1999) and Cai (2002) proposed the WNW estimator, based on an empirical likelihood principle, which is designed to possess the superior properties of local linear methods, such as bias reduction and no boundary effects, while preserving the property that, like the NW estimator, it is always a distribution function. It might, however, require more computational effort, since it involves estimating and optimizing additional weights aimed at the bias correction. Cai (2002) discussed the asymptotic properties of the WNW estimator at both interior and boundary points for mixing time series under some regularity assumptions and showed that the WNW estimator performs better than its competitors; see Cai (2002) for details. Recently, Cosma, Scaillet and von Sachs (2006) proposed a shape preserving estimation method that uses wavelet methodology to estimate cumulative distribution functions and probability density functions for multivariate dependent data, and then to estimate a conditional quantile or CVaR.

The WNW estimator of the conditional distribution F(y | x) of Y_t given X_t = x is defined by

F̂_{c1}(y | x) = ∑_{t=1}^n W_{c,t}(x, h) I(Y_t ≤ y),   (8.5)

where the weights W_{c,t}(x, h) are given by

W_{c,t}(x, h) = p_t(x) W_h(x − X_t) / ∑_{s=1}^n p_s(x) W_h(x − X_s),   (8.6)

and p_t(x) is chosen as p_t(x) = n^{−1} {1 + λ (X_t − x) W_h(x − X_t)}^{−1} ≥ 0, with λ, a function of the data and x, uniquely defined by maximizing the logarithm of the empirical likelihood

L_n(λ) = −∑_{t=1}^n log{1 + λ (X_t − x) W_h(x − X_t)}


subject to the constraint ∑_{t=1}^n p_t(x) = 1 and the discrete moment conditions in (8.4); that is,

∑_{t=1}^n W_{c,t}(x, h) (X_t − x)^j = δ_{0,j}   (8.7)

for 0 ≤ j ≤ 1. See Cai (2002) for details on this aspect. In implementation, Cai (2002) recommended using the Newton-Raphson scheme to find the root of the equation L′_n(λ) = 0. Note that 0 ≤ F̂_{c1}(y | x) ≤ 1 and it is monotone in y, but F̂_{c1}(y | x) is not continuous in y and, of course, not differentiable in y either. Note also that, in a regression setting, Cai (2001) provided a comparison of the local linear estimator and the WNW estimator and discussed the asymptotic minimax efficiency of the WNW estimator.
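As an illustration of the tuning step just described, here is a small R sketch (ours) that finds λ via uniroot rather than Newton-Raphson; the bracketing interval keeps 1 + λ(X_t − x)W_h(x − X_t) positive for all t, and degenerate cases (e.g., x outside the range of the data) are ignored in this sketch.

wnw_lambda=function(x,X,h){
  w=dnorm((x-X)/h)/h                  # W_h(x-X_t), Gaussian W assumed
  d=(X-x)*w
  dL=function(lam) -sum(d/(1+lam*d))  # L_n'(lambda)
  eps=1e-6                            # stay strictly inside 1+lam*d>0
  uniroot(dL,c(-1/max(d)+eps,-1/min(d)-eps))$root
}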

To accommodate all the nice properties (monotonicity, continuity, differentiability, and lying between zero and one) and the attractive asymptotic properties (design adaptation, absence of boundary effects, and mathematical efficiency; see Cai (2002) for detailed discussion) of both estimators F̂_ll(y | x) and F̂_{c1}(y | x) under a unified framework, we propose the following nonparametric estimators of the conditional density function f(y | x) and the conditional distribution function F(y | x), termed weighted double kernel local linear estimation:

f̂_c(y | x) = ∑_{t=1}^n W_{c,t}(x, h) Y*_t(y),

where W_{c,t}(x, h) is given in (8.6), and

F̂_c(y | x) = ∫_{−∞}^y f̂_c(u | x) du = ∑_{t=1}^n W_{c,t}(x, h) G_{h_0}(y − Y_t).   (8.8)

Note that if p_t(x) in (8.6) is constant in t, or λ = 0, then f̂_c(y | x) becomes the classical NW type double kernel estimator used by Scaillet (2005); note, however, that Scaillet (2005) adopted a single bandwidth for smoothing in both the y and x directions. Clearly, f̂_c(y | x) is a probability density function, so F̂_c(y | x) is a cumulative distribution function (monotone, 0 ≤ F̂_c(y | x) ≤ 1, F̂_c(−∞ | x) = 0, and F̂_c(∞ | x) = 1). Also, F̂_c(y | x) is continuous and differentiable in y. Further, as expected, it will be shown that, like F̂_{c1}(y | x), F̂_c(y | x) has the attractive properties of no boundary effects, design adaptation, and mathematical efficiency.
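Continuing the sketch above, the WDKLL distribution estimator F̂_c(y | x) of (8.8) can be coded as follows (again assuming Gaussian K and W, and reusing wnw_lambda from the previous sketch):

Fc=function(y,x,X,Y,h,h0){
  w=dnorm((x-X)/h)/h
  lam=wnw_lambda(x,X,h)       # empirical likelihood tuning constant
  pt=1/(1+lam*(X-x)*w)        # p_t(x) up to the common 1/n factor
  Wc=pt*w/sum(pt*w)           # the weights W_{c,t}(x,h) of (8.6)
  sum(Wc*pnorm((y-Y)/h0))     # (8.8) with G=pnorm
}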

Estimation of Conditional VaR and ES

We are now ready to formulate the nonparametric estimators of ν_p(x) and µ_p(x). To this end, from (8.8), ν_p(x) is estimated by inverting the estimated conditional survival function Ŝ_c(y | x) = 1 − F̂_c(y | x); the resulting estimator is denoted by ν̂_p(x) and defined as ν̂_p(x) = Ŝ_c^{−1}(p | x). Note that ν̂_p(x) always exists, since Ŝ_c(y | x) is itself a survival function. Plugging ν̂_p(x) and f̂_c(y | x) into (8.1), we obtain the nonparametric estimator of µ_p(x):

µ̂_p(x) = p^{−1} ∫_{ν̂_p(x)}^∞ y f̂_c(y | x) dy = p^{−1} ∑_{t=1}^n W_{c,t}(x, h) ∫_{ν̂_p(x)}^∞ y K_{h_0}(y − Y_t) dy
        = p^{−1} ∑_{t=1}^n W_{c,t}(x, h) [ Y_t Ḡ_{h_0}(ν̂_p(x) − Y_t) + h_0 G_{1,h_0}(ν̂_p(x) − Y_t) ],   (8.9)

where Ḡ(u) = 1 − G(u), G_{1,h_0}(u) = G_1(u/h_0), and G_1(u) = ∫_u^∞ v K(v) dv. Note that, as mentioned earlier, ν̂_p(x) in (8.9) can be any consistent estimator.
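The last step is mechanical; in a sketch, ν̂_p(x) is found by a root search and µ̂_p(x) follows from (8.9). With a Gaussian K, G_1(u) = ∫_u^∞ v K(v) dv = dnorm(u), so both estimators can be coded by reusing Fc and wnw_lambda from the sketches above (function names are ours):

cvar=function(x,X,Y,h,h0,p=0.05)     # invert S_c = 1 - F_c at level p
  uniroot(function(y) 1-Fc(y,x,X,Y,h,h0)-p,range(Y)+c(-1,1))$root
ces=function(x,X,Y,h,h0,p=0.05){
  v=cvar(x,X,Y,h,h0,p)
  w=dnorm((x-X)/h)/h
  lam=wnw_lambda(x,X,h)
  pt=1/(1+lam*(X-x)*w); Wc=pt*w/sum(pt*w)
  sum(Wc*(Y*(1-pnorm((v-Y)/h0))+h0*dnorm((v-Y)/h0)))/p  # formula (8.9)
}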

8.4.4 Real Examples

Example 1. We illustrate the proposed methodology with a real data set of Dow Jones Industrials (DJI) index returns. We took a sample of 1801 daily prices of the DJI index, from November 3, 1998 to January 3, 2006, and computed the daily returns as 100 times the difference of the log prices. Let Y_t be the daily negative log return (log loss) of the DJI and X_t the first lag of Y_t. The estimators proposed in this paper are used to estimate the 5% CVaR and CES functions. The estimation results are shown in Figure 8.1: the 5% CVaR estimate in Figure 8.1(a) and the 5% CES estimate in Figure 8.1(b). Both the CVaR and CES estimates exhibit a U-shape, which corresponds to the so-called "volatility smile". Therefore, the risk tends to be lower when the lagged log loss of the DJI is close to its empirical average, and larger otherwise. We can also observe that the curves are asymmetric. This may indicate that the DJI is more likely to fall if there was a loss within the last day than if there was a positive return of the same size.

Example 2. We apply the proposed methods to estimate the conditional value-at-risk and expected shortfall of International Business Machines Co. (NYSE: IBM) stock returns. The data are daily prices recorded from March 1, 1996 to April 6, 2005. We use the same method to calculate the daily returns as in Example 1. To estimate the value-at-risk of a stock return, the information set X_t may in general contain a market index of corresponding capitalization and type, the industry index, and lagged values of the stock return. For this example, Y_t is the log loss of the IBM stock return and, for the sake of simplicity, only two variables are chosen as the information set: X_t1 is the first lag of Y_t, and X_t2 is the first lag of the daily log loss of the Dow Jones Industrials (DJI) index.


[Figure 8.1 appears here; panels are titled "(a) Conditional VaR" and "(b) Conditional ES".]

Figure 8.1: (a) 5% CVaR estimate for the DJI index. (b) 5% CES estimate for the DJI index.

Our main results from the estimation of the model are summarized in Figure 8.2. The surfaces of the estimators for the IBM returns are given in Figure 8.2(a) for CVaR and in Figure 8.2(b) for CES. For visual convenience, Figures 8.2(c) and (e) depict the estimated CVaR and CES curves (as functions of X_t2) for three different values of X_t1 (−0.275, −0.025, 0.325), and Figures 8.2(d) and (f) display the estimated CVaR and CES curves (as functions of X_t1) for three different values of X_t2 (−0.225, 0.025, 0.425).

From Figures 8.2(c)-(f), we can observe that most of these curves are U-shaped, which is consistent with the results observed in Example 1. Also, we can see that the three curves in each panel are not parallel. This implies that the effects of the lagged IBM and lagged DJI variables on the risk of IBM are different and complex. To be concrete, let us examine Figure 8.2(d). The three curves are close to each other when the lagged IBM log loss is around −0.2 and far apart otherwise. This implies that the DJI has less effect (less information) on the CVaR around this value, and more effect when the lagged IBM log loss is far from this value.

8.5 References

Acerbi, C. and D. Tasche (2002). On the coherence of expected shortfall. Journal of Banking and Finance, 26, 1487-1503.


[Figure 8.2 appears here; panels are titled "(a) Conditional VaR surface", "(b) Conditional ES surface", "(c) Conditional VaR", "(d) Conditional VaR", "(e) Conditional ES", and "(f) Conditional ES", with axes labeled IBM and DJI.]

Figure 8.2: (a) 5% CVaR estimates for IBM stock returns. (b) 5% CES estimates for IBM stock returns. (c) 5% CVaR estimates for three different values of the lagged negative IBM return (−0.275, −0.025, 0.325). (d) 5% CVaR estimates for three different values of the lagged negative DJI return (−0.225, 0.025, 0.425). (e) 5% CES estimates for three different values of the lagged negative IBM return (−0.275, −0.025, 0.325). (f) 5% CES estimates for three different values of the lagged negative DJI return (−0.225, 0.025, 0.425).


Artzner, P., F. Delbaen, J.M. Eber, and D. Heath (1999). Coherent measures of risk. Mathematical Finance, 9, 203-228.

Boente, G. and R. Fraiman (1995). Asymptotic distribution of smoothers based on local means and local medians under dependence. Journal of Multivariate Analysis, 54, 77-90.

Cai, Z. (2001). Weighted Nadaraya-Watson regression estimation. Statistics and Probability Letters, 51, 307-318.

Cai, Z. (2002). Regression quantiles for time series data. Econometric Theory, 18, 169-192.

Cai, Z. (2006). Trending time varying coefficient time series models with serially correlated errors. Journal of Econometrics, 136, xxx-xxx.

Cai, Z. and L. Qian (2000). Local estimation of a biometric function with covariate effects. In Asymptotics in Statistics and Probability (M. Puri, ed.), 47-70.

Cai, Z. and R.C. Tiwari (2000). Application of a local linear autoregressive model to BOD time series. Environmetrics, 11, 341-350.

Cai, Z. and X. Xu (2005). Nonparametric quantile estimations for dynamic smooth coefficient models. Working Paper, Department of Mathematics and Statistics, University of North Carolina at Charlotte. Revised for Journal of the American Statistical Association.

Carrasco, M. and X. Chen (2002). Mixing and moments properties of various GARCH and stochastic volatility models. Econometric Theory, 18, 17-39.

Chaudhuri, P. (1991). Nonparametric estimates of regression quantiles and their local Bahadur representation. The Annals of Statistics, 19, 760-777.

Chen, S.X. (2006). Nonparametric estimation of expected shortfall. Working paper, Department of Statistics, Iowa State University.

Chen, S.X. and C.Y. Tang (2005). Nonparametric inference of value at risk for dependent financial returns. Journal of Financial Econometrics, 3, 227-255.

Chernozhukov, V. and L. Umanstev (2001). Conditional value-at-risk: Aspects of modeling and estimation. Empirical Economics, 26, 271-292.

Cosma, A., O. Scaillet and R. von Sachs (2006). Multivariate wavelet-based shape preserving estimation for dependent observations. Bernoulli, in press.

Crouhy, M., D. Galai and R. Mark (2001). Risk Management. New York: McGraw-Hill.

Duffie, D. and J. Pan (1997). An overview of value at risk. Journal of Derivatives, 4, 7-49.

Duffie, D. and K.J. Singleton (2003). Credit Risk: Pricing, Measurement, and Management. Princeton: Princeton University Press.


Eberlein, E. and U. Keller (1995). Hyperbolic distributions in finance. Bernoulli, 1, 281-299.

Embrechts, P., C. Kluppelberg, and T. Mikosch (1997). Modeling Extremal Events for Finance and Insurance. New York: Springer-Verlag.

Engle, R.F. and S. Manganelli (2004). CAViaR: Conditional autoregressive value at risk by regression quantiles. Journal of Business and Economic Statistics, 22, 367-381.

Fan, J. and I. Gijbels (1996). Local Polynomial Modelling and Its Applications. London: Chapman and Hall.

Fan, J. and J. Gu (2003). Semiparametric estimation of value-at-risk. Econometrics Journal, 6, 261-290.

Fan, J., T.-C. Hu and Y.K. Troung (1994). Robust nonparametric function estimation. Scandinavian Journal of Statistics, 21, 433-446.

Fan, J., Q. Yao, and H. Tong (1996). Estimation of conditional densities and sensitivity measures in nonlinear dynamical systems. Biometrika, 83, 189-206.

Frey, R. and A.J. McNeil (2002). VaR and expected shortfall in portfolios of dependent credit risks: Conceptual and practical insights. Journal of Banking and Finance, 26, 1317-1334.

Hall, P., R.C.L. Wolff, and Q. Yao (1999). Methods for estimating a conditional distribution function. Journal of the American Statistical Association, 94, 154-163.

Hurlimann, W. (2003). A Gaussian exponential approximation to some compound Poisson distributions. ASTIN Bulletin, 33, 41-55.

Ibragimov, I.A. and Yu. V. Linnik (1971). Independent and Stationary Sequences of Random Variables. Groningen, the Netherlands: Walters-Noordhoff.

Jorion, P. (2001). Value at Risk, 2nd Edition. New York: McGraw-Hill.

Jorion, P. (2003). Financial Risk Manager Handbook, 2nd Edition. New York: John Wiley.

Koenker, R. and G.W. Bassett (1978). Regression quantiles. Econometrica, 46, 33-50.

Lejeune, M.G. and P. Sarda (1988). Quantile regression: A nonparametric approach. Computational Statistics and Data Analysis, 6, 229-281.

Masry, E. and J. Fan (1997). Local polynomial estimation of regression functions for mixing processes. Scandinavian Journal of Statistics, 24, 165-179.

McNeil, A. (1997). Estimating the tails of loss severity distributions using extreme value theory. ASTIN Bulletin, 27, 117-137.

Morgan, J.P. (1996). Risk Metrics - Technical Documents, 4th Edition, New York.

Oakes, D. and T. Dasu (1990). A note on residual life. Biometrika, 77, 409-410.

Page 128: Advanced Topics in Analysis of Economic and Financial …wise.xmu.edu.cn/Master/News/NewsPic/20071229171021634.pdf · Advanced Topics in Analysis of Economic and Financial Data Using

CHAPTER 8. VALUE AT RISK 118

Potscher, B.M. and I.R. Prucha (1997). Dynamic Nonlinear Econometric Models: Asymp-totic Theory. Berlin: Springer-Verlag.

Rockafellar, R. and S. Uryasev (2000). Optimization of conditional value-at-risk. Journalof Risk, 2, 21-41.

Roussas, G.G. (1969). Nonparametric estimation of the transition distribution function ofa Markov process. The Annals of Mathematical Statistics, 40, 1386-1400.

Roussas, G.G. (1991). Estimation of transition distribution function and its quantiles inMarkov processes: Strong consistency and asymptotic normality. In G.G. Roussas(ed.), Nonparametric Functional Estimation and related Topics, pp. 443-462. Amster-dam: Kluwer Academic.

Samanta, M. (1989). Nonparametric estimation of conditional quantiles. Statistics andProbability Letters, 7, 407-412.

Scaillet, O. (2004). Nonparametric estimation and sensitivity analysis of expected shortfall.Mathematical Finance, 14, 115-129.

Scaillet, O. (2005). Nonparametric estimation of conditional expected shortfall. RevueAssurances et Gestion des Risques/Insurance and Risk Management Journal, 74, 639-660.

Troung, Y.K. (1989). Asymptotic properties of kernel estimators based on local median.The Annals of Statistics, 17, 606-617.

Troung, Y.K. and C.J. Stone (1992) Nonparametric function estimation involving timeseries. The Annals of Statistics 20, 77-97.

Tsay, R.S. (2005). Analysis of Financial Time Series, 2th Edition. John Wiley & Sons,New York.

Yu, K. and M.C. Jones (1997). A comparison of local constant and local linear regressionquantile estimation. Computational Statistics and Data Analysis, 25, 159-166.

Yu, K. and M.C. Jones (1998). Local linear quantile regression. Journal of the AmericanStatistical Association, 93, 228-237.

Page 129: Advanced Topics in Analysis of Economic and Financial …wise.xmu.edu.cn/Master/News/NewsPic/20071229171021634.pdf · Advanced Topics in Analysis of Economic and Financial Data Using

Chapter 9

Long Memory Models and Structural Changes

Long memory time series have been a popular area of research in economics, finance, statistics and other applied fields such as the hydrological sciences in recent years. Long memory dependence was first observed by the hydrologist Hurst (1951) when analyzing the minimal water flow of the Nile River while planning the Aswan Dam. Granger (1966) initiated an intensive discussion of long memory dependence and its consequences in economics. In many applications, however, it is not clear whether the observed dependence structure is real long memory or an artefact of some other phenomenon such as structural breaks or deterministic trends. Whether long memory is genuinely present in the data has strong consequences for modeling and inference.

9.1 Long Memory Models

9.1.1 Methodology

We have discussed that for a stationary time series the ACF decays exponentially to zero as the lag increases, whereas for a unit root nonstationary time series, it can be shown that the sample ACF converges to 1 for all fixed lags as the sample size increases; see Chan and Wei (1988) and Tiao and Tsay (1983). There also exist time series whose ACF decays slowly to zero at a polynomial rate as the lag increases. These processes are referred to as long memory or long range dependent time series. One such example is the fractionally differenced process defined by
$$(1 - L)^d\, x_t = w_t, \qquad |d| < 0.5, \eqno(9.1)$$


where $w_t$ is a white noise series; $d$ is called the long memory parameter, and $H = d + 1/2$ is called the Hurst parameter; see Hurst (1951). Properties of model (9.1) have been widely studied in the literature (e.g., Beran, 1994). We summarize some of these properties below.

1. If $d < 0.5$, then $x_t$ is a weakly stationary process and has the infinite MA representation
$$x_t = w_t + \sum_{k=1}^{\infty} \psi_k\, w_{t-k} \quad \text{with} \quad \psi_k = \frac{d(d+1)\cdots(d+k-1)}{k!} = \binom{k+d-1}{k}.$$

2. If $d > -0.5$, then $x_t$ is invertible and has the infinite AR representation
$$x_t = \sum_{k=1}^{\infty} \pi_k\, x_{t-k} + w_t \quad \text{with} \quad \pi_k = \frac{(0-d)(1-d)\cdots(k-1-d)}{k!} = \binom{k-d-1}{k}.$$

3. For $|d| < 0.5$, the ACF of $x_t$ is
$$\rho_x(h) = \frac{d(1+d)\cdots(h-1+d)}{(1-d)(2-d)\cdots(h-d)}, \qquad h \ge 1.$$
In particular, $\rho_x(1) = d/(1-d)$ and, as $h \to \infty$,
$$\rho_x(h) \approx \frac{(-d)!}{(d-1)!}\, h^{2d-1}.$$

4. For $|d| < 0.5$, the PACF of $x_t$ is $\phi_{h,h} = d/(h-d)$ for $h \ge 1$.

5. For $|d| < 0.5$, the spectral density function $f_x(\cdot)$ of $x_t$, which is the Fourier transform of the autocovariance function $\gamma_x(h)$ of $x_t$, that is,
$$f_x(\nu) = \frac{1}{2} \sum_{h=-\infty}^{\infty} \gamma_x(h) \exp(-i\, h\, \pi\, \nu)$$
for $\nu \in [-1, 1]$, where $i = \sqrt{-1}$, satisfies
$$f_x(\nu) \sim \nu^{-2d} \quad \text{as } \nu \to 0, \eqno(9.2)$$
where $\nu \in [0, 1]$ denotes the frequency.
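As a quick numerical check of property 3, the exact ACF can be computed recursively and compared with the polynomial approximation; this is a minimal sketch, with $d = 0.3$ chosen only for illustration.

# Sketch: exact ACF of model (9.1) (property 3) versus the
# approximation rho_x(h) ~ const * h^(2d-1); d = 0.3 is illustrative.
d   <- 0.3
h   <- 1:200
rho <- cumprod((h - 1 + d) / (h - d))           # rho_x(h), built up recursively
app <- gamma(1 - d) / gamma(d) * h^(2 * d - 1)  # (-d)!/(d-1)! = Gamma(1-d)/Gamma(d)
plot(h, rho, type = "l"); lines(h, app, lty = 2)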

See the books by Hamilton (1994) and Brockwell and Davis (1991) for details about spectral analysis; the basic idea and properties of the spectral density and its estimation are discussed in the next section.

Of particular interest here is the behavior of the ACF of $x_t$ when $0 < d < 0.5$. Property 3 says that $\rho_x(h) \sim h^{2d-1}$, which decays at a polynomial rather than an exponential rate. For this reason, such an $x_t$ process is called a long-memory time series. A special characteristic of the spectral density function in (9.2) is that the spectrum diverges to infinity as $\nu \to 0$, whereas the spectral density function of a stationary ARMA process is bounded for all $\nu \in [-1, 1]$.

Earlier we used the binomial theorem for non-integer powers
$$(1 - L)^d = \sum_{k=0}^{\infty} (-1)^k \binom{d}{k} L^k.$$

If the fractionally differenced series $(1 - L)^d x_t$ follows an ARMA($p, q$) model, then $x_t$ is called a fractionally differenced autoregressive moving average (ARFIMA($p, d, q$)) process, which generalizes the ARIMA model by allowing for non-integer $d$. In practice, if the sample ACF of a time series is not large in magnitude but decays slowly, then the series may have long memory. For more discussion, we refer to the book by Beran (1994). For the pure fractionally differenced model in (9.1), one can estimate $d$ by a maximum likelihood method in the time domain (assuming that the distribution is known), by the approximate Whittle likelihood (see below), or by a regression of the logged periodogram at the lower frequencies (using (9.2)) in the frequency domain. Finally, long-memory models have attracted some attention in the finance literature, in part because of the work on fractional Brownian motion in continuous time models.
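The log-periodogram regression just mentioned is easy to sketch in R: by (9.2), $\log f_x(\nu) \approx c - 2d \log \nu$ near $\nu = 0$, so regressing the logged periodogram on log frequency over the lowest frequencies gives an estimate of $d$. The cutoff $m = \lfloor \sqrt{n} \rfloor$ below is a common rule of thumb, not something prescribed in these notes.

# Sketch: log-periodogram (GPH-type) regression estimate of d.
# Assumes a numeric series x; m = floor(sqrt(n)) is an illustrative cutoff.
gph <- function(x) {
  per <- spec.pgram(x, taper = 0, detrend = TRUE, plot = FALSE)
  m   <- floor(sqrt(length(x)))      # keep only the lowest frequencies
  fit <- lm(log(per$spec[1:m]) ~ log(per$freq[1:m]))
  -coef(fit)[2] / 2                  # slope is about -2d, so d = -slope/2
}

Applied to a simulated series with known $d$ (see the fracdiff.sim() sketch below), the estimate should be close to the true value.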

9.1.2 Spectral Density

We define the basic statistics used for detecting periodicities in time series and for deter-

mining whether different time series are related at certain frequencies. The power spectrum

is a measure of the power or variance of a time series at a particular frequency ν; the func-

tion that displays the power or variance for each frequency ν, say fx(ν) is called the power

spectral density. Although the function fx(·) is the Fourier transform of the autocovariance

function γx(h), we shall not make much use of this theoretical result in our generally applied

discussion. For the theoretical results, please read the book by Brockwell and Davis (1991,

§10.3).

The discrete Fourier transform of a sampled time series $\{x_t\}_{t=0}^{n-1}$ is defined as
$$X(\nu_k) = \frac{1}{\sqrt{n}} \sum_{t=0}^{n-1} x_t \exp(-2\pi i\, \nu_k\, t), \eqno(9.3)$$


where $\nu_k = k/n$, $k = 0, \ldots, n-1$, defines the set of frequencies over which (9.3) is computed. This means that the frequencies over which (9.3) is evaluated are of the form $\nu = 0, 1/n, 2/n, \ldots, (n-1)/n$. The evaluation of (9.3) proceeds using the fast Fourier transform (FFT) and usually assumes that the length of the series, $n$, is some power of 2. If it is not, the series can be padded by adding zeros so that the extended series corresponds to the next highest power of two. One might want to do this anyway if frequencies of the form $\nu_k = k/n$ are not close enough to the frequencies of interest.
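A minimal sketch of this computation in R, using the built-in fft() and nextn() (the latter returns the next highly composite length, here restricted to powers of two); the Nile series is used only because it ships with R.

# Sketch: the DFT (9.3) via fft(), with zero-padding to a power of two.
x  <- as.numeric(Nile)                # any numeric series will do
n  <- length(x)
n2 <- nextn(n, factors = 2)           # next power of 2 >= n
xp <- c(x - mean(x), rep(0, n2 - n))  # mean-adjust, then pad with zeros
X  <- fft(xp) / sqrt(n)               # X(nu_k) as in (9.3)
nu <- (0:(n2 - 1)) / n2               # frequencies nu_k = k / n*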

Since the values of (9.3) have a real and an imaginary part at every frequency and can be positive or negative, it is conventional to calculate first the squared magnitude of (9.3), which is called the periodogram:
$$P_x(\nu_k) = |X(\nu_k)|^2, \eqno(9.4)$$
which is just the sum of the squares of the sine and cosine transforms of the series. If we consider the power spectrum $f_x(\nu_k)$ as the quantity to estimate, it is approximately true¹ (see the books by Hamilton (1994) and Brockwell and Davis (1991, §10.3)) that
$$E[P_x(\nu_k)] = E\left[|X(\nu_k)|^2\right] \approx f_x(\nu_k), \eqno(9.5)$$

which implies that the periodogram is an approximately unbiased estimator of the power spectrum at frequency $\nu_k$. One may also show that the approximate covariance between two frequencies $\nu_k$ and $\nu_l$ is zero for $k \ne l$ when both frequencies are multiples of $1/n$. One can also show that the sine and cosine parts of (9.3) are uncorrelated and approximately normally distributed with equal variances $f_x(\nu_k)/2$. This implies that the squared standardized real and imaginary components have chi-squared distributions with 1 degree of freedom each. The periodogram is the sum of two such squared variables and must then have a chi-squared distribution with 2 degrees of freedom. In other words, the approximate distribution of $P_x(\nu_k)$ is exponential with mean $\lambda_k = f_x(\nu_k)$. The Whittle likelihood is based on this approximate exponential distribution of $\{P_x(\nu_k)\}_{k=0}^{n-1}$.

¹For Ph.D. students, I encourage you to read the related books to understand the details of this aspect, which is important for your future research.
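To make the Whittle idea concrete, here is a minimal sketch for the pure fractional noise model (9.1): the periodogram ordinates are treated as independent exponentials with means proportional to the model spectrum, and the profiled negative log-likelihood is minimized over $d$. This concentrated form of the objective is one standard variant, written from scratch here rather than taken from these notes.

# Sketch: approximate Whittle estimate of d for model (9.1).
# Assumes a numeric series x; lam are the Fourier frequencies in (0, pi).
whittle_d <- function(x) {
  n   <- length(x)
  I   <- Mod(fft(x - mean(x)))^2 / n      # (unnormalized) periodogram
  j   <- 1:floor((n - 1) / 2)
  lam <- 2 * pi * j / n
  I   <- I[j + 1]                         # drop the zero frequency
  obj <- function(d) {                    # profiled Whittle objective
    g <- (2 * sin(lam / 2))^(-2 * d)      # spectral shape |1 - e^(-i lam)|^(-2d)
    log(mean(I / g)) + mean(log(g))
  }
  optimize(obj, interval = c(-0.49, 0.49))$minimum
}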

It should be noted that trends introduce apparent long-range periodicities which inflate the spectrum at low frequencies (small values of $\nu_k$), since the periodogram treats the trend as part of one very long cycle. This behavior will obscure possibly important components at higher frequencies. For this reason (as with the auto- and cross-correlations) it is important to work in (9.3) with the transform of the mean-adjusted ($x_t - \bar{x}$) or detrended ($x_t - a - b\,t$) series. A further modification that sometimes improves the approximation of the expectation in (9.5) is tapering, a procedure whereby each point $x_t$ is replaced by $a_t x_t$, where $a_t$ is a function, often a cosine bell, that is maximal at the midpoint of the series and decays away at the extremes. Tapering makes a difference in engineering applications where there are large variations in the value of the spectral density $f_x(\cdot)$. As a final comment, we note that the fast Fourier transform algorithms for computing (9.3) generally work best when the sample length is some power of two. For this reason, it is conventional to extend the data series $x_t$, $t = 0, 1, \ldots, n-1$, to a length $n^* > n$ that is a power of two by setting $x_t = 0$ for $t = n, \ldots, n^* - 1$. If this is done, the frequency ordinates $\nu_k = k/n$ are replaced by $\nu_k^* = k/n^*$ and we proceed as before.

Theorem 10.3.2 in Brockwell and Davis (1991, p. 347) shows that $P_x(\nu_k)$ is not a consistent estimator of $f_x(\nu)$. Since, for large $n$, the periodogram ordinates are approximately uncorrelated with variances changing only slightly over small frequency intervals, we might hope to construct a consistent estimator of $f_x(\nu)$ by averaging the periodogram ordinates in a small neighborhood of $\nu$; this is called the smoothing estimator. Such a smoothing estimator reduces the variability of the raw periodogram estimator defined above. This procedure leads to a spectral estimator of the form
$$\hat{f}_x(\nu_k) = \frac{1}{2L+1} \sum_{l=-L}^{L} P_x(\nu_k + l/n) = \sum_{l=-L}^{L} w_l\, P_x(\nu_k + l/n), \eqno(9.6)$$
where the periodogram is smoothed over $2L + 1$ frequencies. The width of the interval over which the frequency average is taken is called the bandwidth. Since there are $2L + 1$ frequencies of width $1/n$, the bandwidth $B$ in this case is approximately $B = (2L+1)/n$. The smoothing procedure improves the quality of the estimator of the spectrum, since the estimator is now the average of $2L + 1$ random variables, each having a chi-squared distribution with 2 degrees of freedom; the distribution of the smoothed estimator then has $\mathrm{df} = 2(2L+1)$ degrees of freedom. If the series is padded to the next highest power of 2, say $n^*$, the adjusted degrees of freedom for the estimator will be $\mathrm{df}^* = 2(2L+1)\, n/n^*$ and the new bandwidth will be $B = (2L+1)/n^*$.
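A minimal sketch of (9.6) in R: average the raw periodogram with equal weights over $2L+1$ neighboring frequencies. The built-in series ldeaths and the choice L = 5 are only illustrative; in practice spec.pgram(x, spans = 2*L+1) does the same job with a modified Daniell kernel.

# Sketch: Daniell smoothing of the periodogram, as in (9.6).
L     <- 5
raw   <- spec.pgram(ldeaths, taper = 0, plot = FALSE)        # raw periodogram
fhat  <- stats::filter(raw$spec, rep(1/(2*L + 1), 2*L + 1))  # centered moving average
dfree <- 2 * (2*L + 1)                 # df of the smoothed estimator
B     <- (2*L + 1) / length(ldeaths)   # bandwidth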

Clearly, in (9.6), $w_l = 1/(2L+1)$, which is called the Daniell window (the corresponding spectral window is given by the Daniell kernel). One popular approach is to take $w_l = w(\nu_l)/n$. If $w(x) = r_n^{-1}\sin^2(r_n x/2)/\sin^2(x/2)$ with $r_n = \lfloor n/(2L) \rfloor$, it is the well known Bartlett or triangular window (the corresponding spectral window is given by the Fejér kernel). If $w(x) = \sin((r_n+0.5)x)/\sin(x/2)$, it is the rectangular window (the corresponding spectral window is given by the Dirichlet kernel). See Brockwell and Davis (1991, §10.4) for more discussion.

9.1.3 Applications

The usage of the function fracdiff() is

fracdiff(x, nar = 0, nma = 0,

ar = rep(NA, max(nar, 1)), ma = rep(NA, max(nma, 1)),

dtol = NULL, drange = c(0, 0.5), h, M = 100)

This function can be used to compute the maximum likelihood estimators of the parameters of a fractionally-differenced ARIMA($p, d, q$) model, together (if possible) with their estimated covariance and correlation matrices and standard errors, as well as the value of the maximized likelihood. The likelihood is approximated using the fast and accurate method of Haslett and Raftery (1989). To generate simulated long-memory time series data from the fractional ARIMA($p, d, q$) model, we can use the function fracdiff.sim(), whose usage is

fracdiff.sim(n, ar = NULL, ma = NULL, d,

rand.gen = rnorm, innov = rand.gen(n+q, ...),

n.start = NA, allow.0.nstart = FALSE, ..., mu = 0.)

An alternative way to simulate a long memory time series is to use the function arima.sim(). The manual for the package fracdiff can be downloaded from
http://cran.cnr.berkeley.edu/doc/packages/fracdiff.pdf
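As a quick end-to-end check (a sketch; the values n = 1000 and d = 0.3 are illustrative), one can simulate from the pure fractional model and then recover d with fracdiff():

# Sketch: simulate ARFIMA(0, d, 0) and re-estimate d.
library(fracdiff)
set.seed(123)
sim <- fracdiff.sim(n = 1000, d = 0.3)       # pure fractional noise
fit <- fracdiff(sim$series, nar = 0, nma = 0)
fit$d                                        # should be close to 0.3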

The function spec.pgram() in R calculates the periodogram using a fast Fourier transform, and optionally smooths the result with a series of modified Daniell smoothers (moving averages giving half weight to the end values). The usage of this function is

spec.pgram(x, spans = NULL, kernel, taper = 0.1,
           pad = 0, fast = TRUE, demean = FALSE, detrend = TRUE,
           plot = TRUE, na.action = na.fail, ...)

We can also use the function spectrum() to estimate the spectral density of a time series; its usage is

spectrum(x, ..., method = c("pgram", "ar"))

Finally, it is worth pointing out that there is a package called longmemo for long-memory processes, whose manual can be downloaded from http://cran.cnr.berkeley.edu/doc/packages/longmemo.pdf. This package also provides simple periodogram estimation via the function per() and other functions such as llplot() and lxplot() for plotting spectral densities. See the manual for details.

Example 9.1: As an illustration, Figure 9.1 shows the sample ACFs of the absolute series of daily simple returns for the CRSP value-weighted (left top panel) and equal-weighted (right top panel) indexes from July 3, 1962 to December 31, 1997, together with the sample partial autocorrelation functions of the same series (left and right middle panels). The ACFs are relatively small in magnitude but decay very slowly; they appear to be significant at the 5% level even after 300 lags. Only the first few lags of the PACFs lie outside the confidence interval; the rest stay basically within it. For more information about the behavior of the sample ACF of absolute return series, see Ding, Granger, and Engle (1993). To estimate the long memory parameter d, we can use the function fracdiff() in the package fracdiff in R; the results are d = 0.1867 for the absolute returns of the value-weighted index and d = 0.2732 for the absolute returns of the equal-weighted index. To support the conclusion above, we plot the log smoothed spectral density estimates of the two absolute return series (left and right bottom panels). They show clearly that both log spectral densities decay like a log function, supporting spectral behavior of the form (9.2).

9.2 Related Problems and New Developments

The reading materials are the papers by Ding, Granger, and Engle (1993), Kramer, Sibbertsen and Kleiber (2002), Zeileis, Leisch, Hornik, and Kleiber (2002), and Sibbertsen (2004a).


Figure 9.1: Sample autocorrelation function of the absolute series of daily simple returns for the CRSP value-weighted (left top panel) and equal-weighted (right top panel) indexes. Sample partial autocorrelation function of the absolute series of daily simple returns for the CRSP value-weighted (left middle panel) and equal-weighted (right middle panel) indexes. The log smoothed spectral density estimates of the absolute series of daily simple returns for the CRSP value-weighted (left bottom panel) and equal-weighted (right bottom panel) indexes.

9.2.1 Long Memory versus Structural Breaks

It is a well known stylized fact that many financial time series, such as squares or absolute values of returns or volatilities, and even returns themselves, behave as if they had long memory; see Ding, Granger and Engle (1993) and Sibbertsen (2004b). On the other hand, it is also well known that long memory is easily confused with structural change, in the sense that the slow decay of empirical autocorrelations which is typical for a time series with long memory is also produced when a short-memory time series exhibits structural breaks. Therefore it is of considerable theoretical and empirical interest to discriminate between these sources of slowly decaying empirical autocorrelations.

A structural break is another type of nonstationarity, arising when the population regression function changes over the sample period. This may occur because of changes in economic policy, changes in the structure of the economy or industry, or events that change the dynamics of specific industries or firm-related quantities such as inventories, sales, and production. If such changes, called breaks, occur, then regression models that neglect them lead to misleading inference or forecasting.

Breaks may result from a discrete change (or changes) in the population regression coefficients at distinct dates or from a gradual evolution of the coefficients over a longer period of time. Discrete breaks may result from major changes in economic policy or in the economy (oil shocks), while in “gradual” breaks the population parameters evolve slowly over time, perhaps as a result of slow evolution of economic policy. The former might be characterized by an indicator function and the latter can be described by a smooth transition function.

If a break occurs in the population parameters during the sample, then the OLS regression estimates over the full sample will estimate a relationship that holds only “on average”. The question is then how to test for breaks.

9.2.2 Testing for Breaks (Instability)

Tests for breaks in the regression parameters depend on whether the break date is known or not. If the date of the hypothesized break in the coefficients is known, then the null hypothesis of no break can be tested using a dummy or indicator variable. For example, consider the following model:
$$y_t = \begin{cases} \beta_0 + \beta_1 y_{t-1} + \delta_1 x_t + u_t, & \text{if } t \le \tau, \\ (\beta_0 + \gamma_0) + (\beta_1 + \gamma_1) y_{t-1} + (\delta_1 + \gamma_2) x_t + u_t, & \text{if } t > \tau, \end{cases}$$
where $\tau$ denotes the hypothesized break date. The null hypothesis of no break is $H_0: \gamma_0 = \gamma_1 = \gamma_2 = 0$, and the hypothesis of a break can be tested using the F-statistic. This is called a Chow test, proposed by Chow (1960) for a break at a known break date. Indeed, the above structural break model can be regarded as a special case of the following trending time series model:
$$y_t = \beta_0(t) + \beta_1(t)\, y_{t-1} + \delta_1(t)\, x_t + u_t.$$


For more discussion, see Cai (2007). If there is a distinct break in the regression function, the date at which the largest Chow statistic occurs is an estimator of the break date.
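A minimal sketch of the Chow test in R via dummy interactions, assuming vectors y and x and a candidate break date tau are available (the names are ours, not from the text); anova() performs the F test of the three restrictions:

# Sketch: Chow test at a known break date tau for the model above.
chow <- function(y, x, tau) {
  n  <- length(y)
  y0 <- y[2:n]; yl <- y[1:(n - 1)]; x0 <- x[2:n]   # align with lagged y
  D  <- as.numeric((2:n) > tau)                    # break dummy: 1 after tau
  restricted   <- lm(y0 ~ yl + x0)
  unrestricted <- lm(y0 ~ yl + x0 + D + D:yl + D:x0)
  anova(restricted, unrestricted)                  # F test of gamma's = 0
}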

If there are more variables or more lags, this test can be extended by constructing interactions of the break dummy with all the regressors, and this approach can be modified to check for a break in a subset of the coefficients. In most applications, however, the break date is unknown, although you may suspect that a break occurred sometime between two dates, $\tau_0$ and $\tau_1$. The Chow test can be modified to handle this by testing for a break at each possible date $t$ between $\tau_0$ and $\tau_1$ and then using the largest of the resulting F-statistics to test for a break at an unknown date. This modified test is often called the Quandt likelihood ratio (QLR) statistic or the supWald or supF statistic:
$$\sup F = \max\{F(\tau_0), F(\tau_0 + 1), \ldots, F(\tau_1)\}.$$

Since the supF statistic is the largest of many F-statistics, its distribution is not the same as that of an individual F-statistic, and its critical values must be obtained from a special distribution. This distribution depends on the number of restrictions being tested, $m$, on $\tau_0$ and $\tau_1$, and on the subsample over which the F-statistics are computed, expressed as a fraction of the total sample size.

Other types of F-tests are the average F and exponential F, given by
$$\text{aveF} = \frac{1}{\tau_1 - \tau_0 + 1} \sum_{j=\tau_0}^{\tau_1} F_j \quad \text{and} \quad \text{expF} = \log\left(\frac{1}{\tau_1 - \tau_0 + 1} \sum_{j=\tau_0}^{\tau_1} \exp(F_j/2)\right).$$
For details on modified F-tests, see the papers by Hansen (1992) and Andrews (1993).
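In the strucchange package used later in this chapter, all three statistics can be computed from a single sequence of F-statistics; a minimal sketch (the Nile series is just a convenient built-in example):

# Sketch: supF, aveF and expF tests from one Fstats object.
library(strucchange)
fs <- Fstats(Nile ~ 1, from = 0.15, to = 0.85)  # 15% trimming
sctest(fs, type = "supF")
sctest(fs, type = "aveF")
sctest(fs, type = "expF")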

For the large-sample approximation to the distribution of the supF statistic to be a good one, the subsample endpoints $\tau_0$ and $\tau_1$ cannot be too close to the ends of the sample. That is why the supF statistic is computed over a “trimmed” subset of the sample. A popular choice is to use 15% trimming, that is, to set $\tau_0 = 0.15T$ and $\tau_1 = 0.85T$. With 15% trimming, the F-statistic is computed for break dates in the central 70% of the sample. Table 9.1 presents critical values for the supF statistic computed with 15% trimming. This table is from Stock and Watson (2003), and you should check that book for the complete table.


Table 9.1: Critical Values of the QLR statistic with 15% Trimming

Number of restrictions (m)    10%     5%      1%
 1                           7.12    8.68    12.16
 2                           5.00    5.86     7.78
 3                           4.09    4.71     6.02
 4                           3.59    4.09     5.12
 5                           3.26    3.66     4.53
 6                           3.02    3.37     4.12
 7                           2.84    3.15     3.82
 8                           2.69    2.98     3.57
 9                           2.58    2.84     3.38
10                           2.48    2.71     3.23

The supF test can detect a single break, multiple discrete breaks, and a slow evolution of the regression parameters.

Three classes of structural change tests (or tests for parameter instability), which have been receiving much attention in both the statistics and econometrics communities but had been developed in rather loosely connected lines of research, are unified by embedding them into the framework of generalized M-fluctuation tests (Zeileis and Hornik, 2003). These classes are tests based on maximum likelihood scores (including the Nyblom-Hansen test), on F statistics (supF, aveF, expF tests), and on OLS residuals (OLS-based CUSUM and MOSUM tests; see Chu, Hornik and Kuan (1995)), the last being a special case of the so-called empirical fluctuation process, termed efp in Zeileis, Leisch, Hornik, and Kleiber (2002) and Zeileis and Hornik (2003). Zeileis (2005) showed that representatives from these classes are special cases of the generalized M-fluctuation tests, based on the same functional central limit theorem but employing different functionals for capturing excessive fluctuations. After embedding these tests into the same framework, and thus understanding the relationship between these procedures for testing in historical samples, it is shown how the tests can also be extended to a monitoring situation. This is achieved by establishing a general M-fluctuation monitoring procedure and then applying the different functionals corresponding to monitoring with ML scores, F statistics and OLS residuals. In particular, an extension of the supF test to a monitoring scenario is suggested and illustrated on a real world data set.
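A minimal sketch of such monitoring with strucchange's mefp() and monitor() functions, fitting on a "history" period and then watching the fluctuation process as new data arrive; the split of the Nile data at observation 60 is purely illustrative, and the exact call should be checked against the package manual:

# Sketch: monitoring an OLS-based CUSUM process as new data arrive.
library(strucchange)
nile <- data.frame(y = as.numeric(Nile))
me <- mefp(y ~ 1, data = nile[1:60, , drop = FALSE],
           type = "OLS-CUSUM", alpha = 0.05)   # history period
me <- monitor(me, data = nile)                 # update with the full sample
plot(me)                                       # boundary crossing signals a break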

In R, there are two packages, strucchange, developed by Zeileis, Leisch, Hornik, and Kleiber (2002), and segmented, that provide several methods for testing for breaks. They can be downloaded at http://cran.cnr.berkeley.edu/doc/packages/strucchange.pdf and http://cran.cnr.berkeley.edu/doc/packages/segmented.pdf, respectively.

Example 9.2: We use the data set built into R for the minimal water flow of the Nile River when planning the Aswan Dam (see Hurst, 1951): yearly data from 1871 to 1970 with 100 observations, under the name Nile. Alternatively, we might use the data set built into the package longmemo in R, yearly data with 663 observations from 622 to 1284, under the name NileMin. Here we use R to analyze the data set Nile.


Figure 9.2: Break testing results for the Nile River data: (a) Plot of F-statistics. (b) The scatterplot with the breakpoint. (c) Plot of the empirical fluctuation process with linear boundaries. (d) Plot of the empirical fluctuation process with alternative boundaries.

As a result, the p-value for testing a break is very small (see the computer output), so that H0 is rejected. The details are summarized in Figures 9.2(a)-(b). It is clear from Figure 9.2(b) that there is one breakpoint for the Nile River data: the annual flows drop in 1898 because the first Aswan dam was built. To test the null hypothesis that the annual flow remains constant over the years, we also compute the OLS-based CUSUM process and plot it with standard linear and alternative boundaries in Figures 9.2(c)-(d).


Example 9.3: We build a time series model for the real oil price (quarterly) listed in the tenth column of the file “ex9-3.csv”, ranging from the first quarter of 1959 to the third quarter of 2002. Before we build such a time series model, we want to see whether there is any structural change in the oil price. As a result, the p-value for testing a break is very small (see the computer output), so that H0 is rejected. The details are summarized in Figure 9.3.


Figure 9.3: Break testing results for the oil price data: (a) Plot of F-statistics. (b) Scatterplot with the breakpoint. (c) Plot of the empirical fluctuation process with linear boundaries. (d) Plot of the empirical fluctuation process with alternative boundaries.

It is clear from Figure 9.3(b) that there is one breakpoint for the oil price. We also compute the OLS-based CUSUM process and plot it with standard linear and alternative boundaries in Figures 9.3(c)-(d).

If we consider the quarterly price level (CPI) listed in the eighth column of the file “ex9-3.csv” (1959.1 – 2002.3), we want to see whether there is any structural change in the quarterly consumer price index. The details are summarized in Figure 9.4. Figure 9.4(b) suggests that there would be one breakpoint for the consumer price index. We also compute the OLS-based CUSUM process and plot it with standard linear and alternative boundaries in Figures 9.4(c)-(d). Please think about the conclusion for the CPI! Do you believe this result? If not, what happened?



Figure 9.4: Break testing results for the consumer price index data: (a) Plot of F-statistics. (b) Scatterplot with the breakpoint. (c) Plot of the empirical fluctuation process with linear boundaries. (d) Plot of the empirical fluctuation process with alternative boundaries.

Sometimes you may suspect that a series either has a unit root or is a trend stationary process with a structural break at some unknown period of time, and you would want to test the null hypothesis of a unit root against the alternative of a trend stationary process with a structural break. This is exactly the hypothesis testing procedure proposed by Zivot and Andrews (1992). In this procedure, the null hypothesis is a unit root process without any structural breaks, and the alternative hypothesis is a trend stationary process with a possible structural change occurring at an unknown point in time. Zivot and Andrews (1992) suggested estimating the following regression:
$$x_t = \begin{cases} \mu + \beta\, t + \alpha\, x_{t-1} + \sum_{i=1}^{k} c_i\, \Delta x_{t-i} + e_t, & \text{if } t \le \tau T, \\ [\mu + \theta] + [\beta\, t + \gamma\,(t - T_B)] + \alpha\, x_{t-1} + \sum_{i=1}^{k} c_i\, \Delta x_{t-i} + e_t, & \text{if } t > \tau T, \end{cases} \eqno(9.7)$$
where $\tau = T_B/T$ is the break fraction. Model (9.7) is estimated by OLS with the break point ranging over the sample, and the t-statistic for testing $\alpha = 1$ is computed; the minimum t-statistic is reported. The 1%, 5% and 10% critical values are $-5.34$, $-4.80$ and $-4.58$, respectively. The appropriate number of lagged differences $k$ is estimated for each value of $\tau$. Please read the papers by Zivot and Andrews (1992) and Sadorsky (1999) for more details about this method and empirical applications.
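For a sketch in R: the Zivot-Andrews test is implemented as ur.za() in the urca package (a package not used elsewhere in these notes, so this is an assumption worth checking against its manual); model = "both" allows a break in both intercept and trend, and lag corresponds to k in (9.7).

# Sketch: Zivot-Andrews unit root test with an unknown break date.
library(urca)
za <- ur.za(op, model = "both", lag = 4)  # op: oil price series from Example 9.3
summary(za)                               # minimum t-statistic and implied break point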

9.2.3 Long Memory versus Trends

A more general problem than distinguishing long memory from structural breaks is, in a way, the question whether general trends in the data can cause the Hurst effect. The paper by Bhattacharya et al (1983) was the first to deal with this problem; they found that adding a deterministic trend to a short memory process can cause spurious long memory. This makes modeling long memory time series more complicated. For this issue, we follow Section 5 of Zeileis (2004a). Note that there is a lot of ongoing research (theoretical and empirical) in this area.

9.3 Computer Codes

#####################

# This is Example 9.1

#####################

z1<-matrix(scan("c:/res-teach/xiamen12-06/data/ex9-1.txt"),byrow=T,ncol=5)
vw=abs(z1[,3])    # absolute returns, value-weighted index
n_vw=length(vw)
ew=abs(z1[,4])    # absolute returns, equal-weighted index
postscript(file="c:/res-teach/xiamen12-06/figs/fig-9.1.eps",
           horizontal=F,width=6,height=6)
par(mfrow=c(3,2),mex=0.4,bg="light green")
# sample ACFs and PACFs up to 400 lags
acf(vw, ylab="",xlab="",ylim=c(-0.1,0.4),lag=400,main="")
text(200,0.38,"ACF for value-weighted index")
acf(ew, ylab="",xlab="",ylim=c(-0.1,0.4),lag=400,main="")
text(200,0.38,"ACF for equal-weighted index")
pacf(vw, ylab="",xlab="",ylim=c(-0.1,0.3),lag=400,main="")
text(200,0.28,"PACF for value-weighted index")
pacf(ew, ylab="",xlab="",ylim=c(-0.1,0.3),lag=400,main="")
text(200,0.28,"PACF for equal-weighted index")
# maximum likelihood estimates of the long memory parameter d
library(fracdiff)
d1=fracdiff(vw,ar=0,ma=0)
d2=fracdiff(ew,ar=0,ma=0)
print(c(d1$d,d2$d))
# smoothed log spectral densities, padding the series to a power of 2
m1=round(log(n_vw)/log(2)+0.5)
pad1=1-n_vw/2^m1
vw_spec=spec.pgram(vw,spans=c(5,7),demean=T,detrend=T,pad=pad1,plot=F)
ew_spec=spec.pgram(ew,spans=c(5,7),demean=T,detrend=T,pad=pad1,plot=F)
vw_x=vw_spec$freq[1:1000]
vw_y=vw_spec$spec[1:1000]
ew_x=ew_spec$freq[1:1000]
ew_y=ew_spec$spec[1:1000]
scatter.smooth(vw_x,log(vw_y),span=1/15,ylab="",xlab="",col=6,cex=0.7,main="")
text(0.03,-7,"Log Smoothed Spectral Density of VW")
scatter.smooth(ew_x,log(ew_y),span=1/15,ylab="",xlab="",col=7,cex=0.7,main="")
text(0.03,-7,"Log Smoothed Spectral Density of EW")
dev.off()

##########################

# This is for Example 9.2

##########################

graphics.off()
library(strucchange)
library(longmemo)   # not used
postscript(file="c:/res-teach/xiamen12-06/figs/fig-9.2.eps",
           horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light blue")
## Nile data with one breakpoint: the annual flows drop in 1898
## because the first Aswan dam was built
data(Nile)
## test whether the annual flow remains constant over the years
fs.nile=Fstats(Nile ~ 1)
plot(fs.nile)
print(sctest(fs.nile))
## visualize the breakpoint implied by the argmax of the F statistics
plot(Nile)
lines(breakpoints(fs.nile))
## compute OLS-based CUSUM process and plot
## with standard and alternative boundaries
ocus.nile=efp(Nile ~ 1, type = "OLS-CUSUM")
plot(ocus.nile)
plot(ocus.nile, alpha = 0.01, alt.boundary = TRUE)
## calculate corresponding test statistic
print(sctest(ocus.nile))
dev.off()

#########################

# This is for Example 9.3

#########################

y=read.csv("c:/res-teach/xiamen12-06/data/ex9-3.csv",header=T,skip=1)
op=y[,10]   # real oil price (quarterly)
postscript(file="c:/res-teach/xiamen12-06/figs/fig-9.3.eps",
           horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light pink")
op=ts(op)
fs.op=Fstats(op ~ 1)   # no lags and no covariates
plot(fs.op)
print(sctest(fs.op))
## visualize the breakpoint implied by the argmax of the F statistics
plot(op,type="l")
lines(breakpoints(fs.op))
ocus.op=efp(op ~ 1, type = "OLS-CUSUM")
plot(ocus.op)
plot(ocus.op, alpha = 0.01, alt.boundary = TRUE)
## calculate corresponding test statistic
print(sctest(ocus.op))
dev.off()

cpi=y[,8]   # quarterly consumer price index
cpi=ts(cpi)
fs.cpi=Fstats(cpi ~ 1)
print(sctest(fs.cpi))
postscript(file="c:/res-teach/xiamen12-06/figs/fig-9.4.eps",
           horizontal=F,width=6,height=6)
par(mfrow=c(2,2),mex=0.4,bg="light yellow")
plot(fs.cpi)
## visualize the breakpoint implied by the argmax of the F statistics
plot(cpi,type="l")
lines(breakpoints(fs.cpi))
ocus.cpi=efp(cpi ~ 1, type = "OLS-CUSUM")
plot(ocus.cpi)
plot(ocus.cpi, alpha = 0.01, alt.boundary = TRUE)
## calculate corresponding test statistic
print(sctest(ocus.cpi))
dev.off()


9.4 References

Andrews, D.W.K. (1993). Tests for parameter instability and structural change with unknown change point. Econometrica, 61, 821-856.

Beran, J. (1994). Statistics for Long-Memory Processes. London: Chapman and Hall.

Bhattacharya, R.N., V.K. Gupta and W. Waymire (1983). The Hurst effect under trends. Journal of Applied Probability, 20, 649-662.

Brockwell, P.J. and R.A. Davis (1991). Time Series: Theory and Methods. New York: Springer.

Chan, N.H. and C.Z. Wei (1988). Limiting distributions of least squares estimates of unstable autoregressive processes. Annals of Statistics, 16, 367-401.

Chow, G.C. (1960). Tests of equality between sets of coefficients in two linear regressions. Econometrica, 28, 591-605.

Chu, C.S.J., K. Hornik and C.M. Kuan (1995). MOSUM tests for parameter constancy. Biometrika, 82, 603-617.

Ding, Z., C.W.J. Granger and R.F. Engle (1993). A long memory property of stock returns and a new model. Journal of Empirical Finance, 1, 83-106.

Granger, C.W.J. (1966). The typical spectral shape of an economic variable. Econometrica, 34, 150-161.

Hamilton, J.D. (1994). Time Series Analysis. Princeton: Princeton University Press.

Hansen, B.E. (1992). Testing for parameter instability in linear models. Journal of Policy Modeling, 14, 517-533.

Haslett, J. and A.E. Raftery (1989). Space-time modelling with long-memory dependence: Assessing Ireland's wind power resource (with discussion). Applied Statistics, 38, 1-50.

Hsu, C.-C. and C.-M. Kuan (2000). Long memory or structural changes: Testing method and empirical examination. Working paper, Department of Economics, National Central University, Taiwan.

Hurst, H.E. (1951). Long-term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers, 116, 770-799.

Kramer, W., P. Sibbertsen and C. Kleiber (2002). Long memory versus structural change in financial time series. Allgemeines Statistisches Archiv, 86, 83-96.

Nyblom, J. (1989). Testing for the constancy of parameters over time. Journal of the American Statistical Association, 84, 223-230.


Sadorsky, P. (1999). Oil price shocks and stock market activity. Energy Economics, 21, 449-469.

Sibbertsen, P. (2004a). Long memory versus structural breaks: An overview. Statistical Papers, 45, 465-515.

Sibbertsen, P. (2004b). Long memory in volatilities of German stock returns. Empirical Economics, 29, 477-488.

Stock, J.H. and M.W. Watson (2003). Introduction to Econometrics. Addison-Wesley.

Tiao, G.C. and R.S. Tsay (1983). Consistency properties of least squares estimates of autoregressive parameters in ARMA models. Annals of Statistics, 11, 856-871.

Tsay, R.S. (2005). Analysis of Financial Time Series, 2nd Edition. New York: John Wiley & Sons.

Zeileis, A. (2005). A unified approach to structural change tests based on ML scores, F statistics, and OLS residuals. Econometric Reviews, 24, 445-466.

Zeileis, A. and K. Hornik (2003). Generalized M-fluctuation tests for parameter instability. Report 80, SFB "Adaptive Information Systems and Modelling in Economics and Management Science". URL http://www.wu-wien.ac.at/am/reports.htm#80.

Zeileis, A., F. Leisch, K. Hornik and C. Kleiber (2002). strucchange: An R package for testing for structural change in linear regression models. Journal of Statistical Software, 7, Issue 2, 1-38.

Zivot, E. and D.W.K. Andrews (1992). Further evidence on the great crash, the oil price shock and the unit root hypothesis. Journal of Business and Economic Statistics, 10, 251-270.