This article was downloaded by: [University of Illinois Chicago]
On: 28 October 2014, At: 18:17
Publisher: Taylor & Francis
Informa Ltd Registered in England and Wales Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

The American Statistician
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/utas20

Examining Factors to Better Understand Autoregressive Models
Wayne A. Woodward, Henry L. Gray, James R. Haney & Alan C. Elliott

Wayne A. Woodward is Professor and Chair, Henry L. Gray is C. F. Frensley Professor Emeritus, and James R. Haney is Visiting Assistant Professor, Department of Statistical Science, Southern Methodist University, Dallas, TX 75275. Alan C. Elliott is a Faculty Member, Division of Biostatistics, Department of Clinical Sciences, University of Texas Southwestern Medical Center at Dallas, TX 75390.
Published online: 01 Jan 2012.

To cite this article: Wayne A. Woodward, Henry L. Gray, James R. Haney & Alan C. Elliott (2009) Examining Factors to Better Understand Autoregressive Models, The American Statistician, 63:4, 335-342, DOI: 10.1198/tast.2009.08283

To link to this article: http://dx.doi.org/10.1198/tast.2009.08283

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions




Teacher’s Corner

Examining Factors to Better Understand Autoregressive Models

Wayne A. WOODWARD, Henry L. GRAY, James R. HANEY, and Alan C. ELLIOTT

Most introductory time series analysis courses and texts designed for students in statistics, engineering, applied sciences, econometrics, and finance are sorely lacking when it comes to providing much depth of understanding about the basic features of the autoregressive (AR) model obtained for a realization. We show that factoring an AR model is useful for conveying information about the model, including stationarity as well as time and frequency domain information. While it is common practice to express models with unit roots or seasonal models in a factored form, we recommend factoring the stationary components of the model and using a factor table to summarize the information about model factors. We also discuss the decomposition of an AR realization into its additive components, which helps students visualize the information in the factor table. The factor table and decomposition can be used as teaching tools, so that students will no longer view the AR model as a “black box” from which to obtain forecasts, spectral estimates, and so on. We illustrate these techniques using an AR(9) model fit to the sunspot data. We provide a link to software that can be used in time series courses to implement the ideas discussed in the article.

KEY WORDS: Autocorrelations; Factor tables; Spectral density; Time series.

1. INTRODUCTION

The autoregressive (AR) model is widely used for modeling stationary and certain types of nonstationary processes. The AR model is introduced in standard time series texts (e.g., Brockwell and Davis 1998, 2002; Tsay 2002; Shumway and Stoffer 2006; Box, Jenkins, and Reinsel 2008; Cryer and Chan 2008) and has proven very useful for such applications as forecasting and spectral estimation. The process Xt, t = 0, ±1, ±2, . . . , with E(Xt) = μ and Var(Xt) = σ²X is, loosely speaking, a pth-order AR process, AR(p), if

Xt − μ − ϕ1(Xt−1 − μ) − · · · − ϕp(Xt−p − μ) = at,   (1)

Wayne A. Woodward is Professor and Chair (E-mail: [email protected]), Henry L. Gray is C. F. Frensley Professor Emeritus (E-mail: [email protected]), and James R. Haney is Visiting Assistant Professor (E-mail: [email protected]), Department of Statistical Science, Southern Methodist University, Dallas, TX 75275. Alan C. Elliott is a Faculty Member, Division of Biostatistics, Department of Clinical Sciences, University of Texas Southwestern Medical Center at Dallas, TX 75390 (E-mail: [email protected]).

where the ϕi’s are real constants with ϕp ≠ 0 and where at is mean-0 white noise with variance σ²a (see Brockwell and Davis 1998). In the remainder of this article we assume, without loss of generality, that μ = 0.

The typical procedure for fitting an AR(p) model to a set of data is to first identify the order p using the Akaike information criterion (AIC; Akaike 1974) or another order identification method, and then estimate the coefficients. The features of the resulting model, especially when p > 2, usually will be unrecognizable from the estimated coefficients themselves. In this article we discuss methods to help students (and practitioners) better understand the characteristics of the fitted model. We first outline some useful facts about AR models.

It is often convenient to write the AR model using operator notation based on the backshift operator defined by B^k Xt = Xt−k. Using this notation, the mean-0 form of the model in (1) is written as

(1 − ϕ1B − · · · − ϕpB^p)Xt = at,   (2)

which is sometimes written as ϕ(B)Xt = at. Corresponding to the operator ϕ(B) is the characteristic polynomial ϕ(r) = 1 − ϕ1r − · · · − ϕpr^p, where r is a real or complex number. It is well known that an AR(p) process is causal and weakly stationary (henceforth we simply use the term “stationary”) if and only if the roots of the characteristic equation,

ϕ(r) = 1 − ϕ1r − · · · − ϕpr^p = 0,   (3)

all lie outside the unit circle, that is, have modulus > 1 (e.g., Brockwell and Davis 1998).
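The stationarity condition in (3) is easy to check numerically. The sketch below is ours, not the article's software; the coefficient vectors in the example calls are hypothetical illustrations:

```python
import numpy as np

def is_stationary(phi):
    """True if the AR(p) with coefficients phi = [phi1, ..., phip], in the
    convention (1 - phi1 B - ... - phip B^p) X_t = a_t, has all roots of
    the characteristic equation 1 - phi1 r - ... - phip r^p outside the
    unit circle."""
    # np.roots expects the highest-degree coefficient first
    coeffs = np.r_[-np.asarray(phi)[::-1], 1.0]
    return bool(np.all(np.abs(np.roots(coeffs)) > 1))

print(is_stationary([1.0, -0.9]))  # True: both roots have modulus near 1.05
print(is_stationary([1.5]))        # False: the single root 1/1.5 is inside
```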

Another important feature of stationary AR(p) models is that for k > 0, the autocorrelation function, defined by ρk = Cov(Xt, Xt+k)/σ²X, satisfies the linear homogeneous difference equation with constant coefficients

ρk − ϕ1ρk−1 − · · · − ϕpρk−p = 0.   (4)

The general solution of this type of difference equation, in the case of no repeated roots of the characteristic equation, is given by

ρk = Σ_{j=1}^{p} cj rj^−k,   (5)

where rj, j = 1, . . . , p, denote the p roots of the characteristic equation and the cj’s are (possibly complex) constants (see Box, Jenkins, and Reinsel 2008).

We also note that the spectral density, SX, of an AR process Xt has a simple formula given by

SX(f) = σ²a / (σ²X |1 − ϕ1e^−2πif − · · · − ϕpe^−2πifp|²),

© 2009 American Statistical Association DOI: 10.1198/tast.2009.08283 The American Statistician, November 2009, Vol. 63, No. 4 335


Figure 1. Realizations from the AR(3) and AR(4) models in (6).

where the frequency f is the reciprocal of the period length. In this article we plot 10 log10(SX), and for simplicity we refer to these as plots of the spectral density.
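The spectral density formula can be evaluated on a frequency grid. A minimal sketch (ours, not from the article): the helper below drops the constant σ²a/σ²X, which only shifts the decibel curve vertically, and the AR(2) coefficients used are hypothetical:

```python
import numpy as np

def ar_spectrum_db(phi, n_freq=501):
    """10*log10 of 1/|1 - phi1 e^{-2 pi i f} - ... - phip e^{-2 pi i f p}|^2
    on a grid f in [0, 0.5]; the factor sigma_a^2/sigma_X^2 is omitted,
    which only shifts the curve vertically in dB."""
    f = np.linspace(0.0, 0.5, n_freq)
    k = np.arange(1, len(phi) + 1)
    # phi(e^{-2 pi i f}) evaluated at every grid frequency
    transfer = 1.0 - np.exp(-2j * np.pi * np.outer(f, k)) @ np.asarray(phi)
    return f, -10.0 * np.log10(np.abs(transfer) ** 2)

# hypothetical AR(2) with complex roots near the unit circle
f, s_db = ar_spectrum_db([1.0, -0.9])
print(f[np.argmax(s_db)])  # peak frequency, close to 0.16 for these values
```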

A Challenge Question. Consider the following AR(3) and AR(4) models:

(A) Xt − 1.95Xt−1 + 1.85Xt−2 − 0.855Xt−3 = at,
(B) Xt − 0.2Xt−1 − 1.23Xt−2 + 0.26Xt−3 + 0.66Xt−4 = at,
(C) Xt − 0.38Xt−1 − 0.09Xt−2 − 0.19Xt−3 − 0.29Xt−4 = at,
(D) Xt − Xt−1 − 0.5Xt−2 + 0.8Xt−3 − 0.7Xt−4 = at.   (6)

Figure 1 shows realizations of length n = 100 from these four models in random order. Based on an examination of the model coefficients, can you tell which realization goes with which model?
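Realizations like those in Figure 1 are straightforward to generate from the defining difference equation. The sketch below is ours, not the authors' code; the burn-in length and unit noise variance are arbitrary choices:

```python
import numpy as np

def ar_realization(phi, n=100, burn=200, seed=1):
    """Simulate n values of (1 - phi1 B - ... - phip B^p) X_t = a_t
    with unit-variance Gaussian white noise, discarding a burn-in."""
    rng = np.random.default_rng(seed)
    phi = np.asarray(phi)
    p = len(phi)
    x = np.zeros(p + burn + n)
    a = rng.standard_normal(p + burn + n)
    for t in range(p, len(x)):
        # x[t-p:t][::-1] is (x_{t-1}, ..., x_{t-p})
        x[t] = phi @ x[t - p:t][::-1] + a[t]
    return x[-n:]

x_a = ar_realization([1.95, -1.85, 0.855])  # Model A: stationary
```

Note that running the same recursion with Model D's coefficients produces an explosively growing series, consistent with its nonstationarity.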

This is a difficult problem, and in fact it cannot be solved without further information. Clearly, the relationship between the coefficients and the behavior of the data is not readily detectable from examination of the model coefficients, and to most students the model at this point appears to be a sort of “black box.” We have observed that most standard time series texts do not provide much guidance for understanding the models.

In this article we recommend factoring AR(p) models, and also recommend using the factor table as a teaching (and research) tool for understanding the fundamental characteristics (or latent components) of an AR(p) model. We use factor tables to answer the foregoing challenge question. The first two authors have used factor tables for many years to introduce students to the AR model. In our experience, this is a very effective way to illustrate the basic features of a given AR(p) model, for demonstrating such concepts as the dominance of factors associated with roots substantially closer to the unit circle than other roots in the model and the effect of differencing an AR realization. Finally, we discuss a decomposition of a realization into additive components that correspond to information in the factor table. We illustrate the methods discussed herein using an AR(9) model fit to the classical sunspot data.

2. AR(1) AND AR(2) MODELS

Before discussing the factoring of AR(p) models, we review some relevant facts concerning stationary AR models. These are typically covered in standard time series texts such as those listed previously, and we discuss them only briefly here. Developing an understanding of the behavior of an AR(p) process requires an understanding of basic properties of simple AR(1) and AR(2) models.

(a) Facts about the AR(1) model, Xt − ϕ1Xt−1 = at. In this case the characteristic equation is given by 1 − ϕ1r = 0, so that the root, r1, of the characteristic equation is given by r1 = ϕ1^−1. As a consequence, an AR(1) process is stationary if and only if |ϕ1| < 1. It is instructive to note that because |r1^−1| = |ϕ1|, the proximity of the root to the unit circle can be observed by simple examination of ϕ1.

(i) The autocorrelation function of a stationary AR(1) process is given by ρk = ϕ1^k for k ≥ 0; that is, in (5), c1 = 1 and r1 = ϕ1^−1. The autocorrelation function is a damped exponential if ϕ1 > 0 and an oscillating damped exponential if ϕ1 < 0.

(ii) As a result of the autocorrelations in (i), realizations from an AR(1) model with ϕ1 > 0 tend to be aperiodic with a general “wandering” behavior. When ϕ1 < 0, the realizations tend to oscillate back and forth across the mean.

(iii) The spectral density, SX(f), has a peak at f = 0 if ϕ1 > 0 and at f = 0.5 if ϕ1 < 0.
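Fact (i) is easy to demonstrate numerically; a small sketch of ours, with illustrative values of ϕ1 that are not from the article:

```python
# rho_k = phi1**k for a stationary AR(1)
def ar1_acf(phi1, k):
    return phi1 ** k

print([round(ar1_acf(0.9, k), 3) for k in range(4)])
# -> [1.0, 0.9, 0.81, 0.729]  (damped exponential)
print([round(ar1_acf(-0.9, k), 3) for k in range(4)])
# -> [1.0, -0.9, 0.81, -0.729]  (oscillating damped exponential)
```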

(b) Facts about the AR(2) model, Xt − ϕ1Xt−1 − ϕ2Xt−2 = at. In this case the characteristic equation is 1 − ϕ1r − ϕ2r² = 0, and the features of the AR(2) process depend on the nature of the roots of this quadratic equation.

Case 1 (The roots of the characteristic equation are complex). In this case, |ri^−1| = √−ϕ2, where ri, i = 1, 2, are complex conjugate roots of the characteristic equation (Box, Jenkins, and Reinsel 2008). This provides a useful measure of how close the roots are to the unit circle.

(i) In the complex roots case, ρk in (5) has two terms where c2 = c1* and r2 = r1*, where z* denotes the complex conjugate, and (5) takes the simplified form

ρk = d(√−ϕ2)^k sin(2πf0k + ψ),   (7)

where d and ψ are real constants and f0 = (1/2π) cos−1(ϕ1/(2√−ϕ2)). The important point is that the autocorrelation function has a damped sinusoidal behavior with “system frequency” f0, where 0 < f0 < 0.5.

(ii) As a result of the autocorrelations in (i), realizations from an AR(2) model in this case will be pseudocyclic


Figure 2. Realizations, autocorrelations, and spectral densities for two AR(2) processes with complex roots.

with frequency f0 as given in (i), that is, with period 1/f0.

(iii) If √−ϕ2 is sufficiently close to 1, then the spectral density will have a peak near f0.

Figure 2 shows realizations, autocorrelations, and spectral densities for two AR(2) processes associated with complex roots. For the model shown in Figure 2(a), f0 = 0.097, the realization shows a pseudoperiodic behavior, and the autocorrelations have the form of a damped sinusoid, both with a period of about 10 units. Also, the spectral density has a peak at approximately f = 0.1. In the model shown in Figure 2(b), f0 = 0.344, the realization displays a higher-frequency pseudoperiodic behavior, with a period of about 3 units, while the autocorrelations have a damped sinusoidal behavior, also with a period of about 3. The spectral density has a peak at about f = 1/3. It should be noted that the roots of the characteristic equation associated with the AR(2) model in Figure 2(a) are closer to the unit circle than those in Figure 2(b) (which can be easily seen by examining √−ϕ2), and, consequently, for the model in Figure 2(a), the pseudoperiodic behavior in the data is more pronounced, the autocorrelations damp more slowly, and the peak in the spectral density is sharper.

Case 2 (The roots of the characteristic equation are real). In this case the characteristic equation factors as (1 − r1^−1 r)(1 − r2^−1 r) = 0, where r1 and r2 are the roots of the characteristic equation. Visually, ρk will have the appearance of a damped exponential or a mixture of damped exponentials, and realizations from these models will display a mixture of the behaviors manifested by their autocorrelations. The spectral density will have a peak at f = 0, at f = 0.5, or at both f = 0 and f = 0.5, depending on the signs of r1 and r2.
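The quantities in Case 1 can be computed directly from ϕ1 and ϕ2. A minimal sketch (ours; the coefficients below are hypothetical, chosen so that the roots are complex and close to the unit circle):

```python
import math

def ar2_system_frequency(phi1, phi2):
    """f0 = (1/2pi) * arccos(phi1 / (2*sqrt(-phi2))) for an AR(2)
    whose characteristic roots are complex (phi1**2 + 4*phi2 < 0)."""
    if phi1 ** 2 + 4 * phi2 >= 0:
        raise ValueError("characteristic roots are real")
    return math.acos(phi1 / (2 * math.sqrt(-phi2))) / (2 * math.pi)

print(ar2_system_frequency(1.0, -0.9))  # about 0.162: pseudo-period near 6
print(math.sqrt(0.9))                   # |1/r| about 0.949, near the unit circle
```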

3. THE AR(p) MODEL AND FACTOR TABLES

The key to providing students with an enhanced understanding of the behavior of AR(p) processes is to help them realize that the AR(1) and AR(2) models serve as its “building blocks.” Clearly, the roots of the pth-order characteristic equation ϕ(r) = 1 − ϕ1r − · · · − ϕpr^p = 0 will be real, complex (in which case they occur as complex conjugate pairs), or a combination of real and complex roots. Thus ϕ(r) can be written in factored form, where each factor has real coefficients and is either a linear factor or an irreducible quadratic factor. In general,

ϕ(r) = Π_{j=1}^{m} (1 − α1j r − α2j r²),   (8)

where α2j may be 0. Properties of the roots satisfying ϕ(r) = 0 in (8) follow those already discussed for the AR(1) and AR(2) models and are discussed only briefly here.

(i) Factors associated with real roots (linear factors). Factors of (8) associated with real roots are linear factors, that is, α2j = 0. The absolute value of the reciprocal of the associated root is given by |rj^−1| = |α1j|, and stationarity implies that |α1j| < 1. “System frequencies,” f0, associated with these real roots are f0 = 0 if α1j > 0 and f0 = 0.5 if α1j < 0.

(ii) Factors associated with complex roots (quadratic factors). In this case the factor 1 − α1j r − α2j r² is an irreducible quadratic factor, and the two roots associated with this factor are complex conjugate pairs. For measuring the proximity of a root to the unit circle, it is useful to note that |rj^−1| = √−α2j.

Each j = 1, . . . , m in (8) corresponds to either a real root or a pair of complex conjugate roots, and we assume, without loss of generality, that j = 1, . . . , mr correspond to the real roots and j = mr + 1, . . . , m correspond to pairs of complex conjugate roots. Here we consider only the case in which there are no repeated roots of the characteristic equation. The general expression for the autocorrelation of an AR(p) from (5) can be


rewritten as

ρk = Σ_{j=1}^{mr} cj (rj^−1)^k + Σ_{j=mr+1}^{m} (cj1 (rj^−1)^k + cj2 (rj*^−1)^k)
   = Σ_{j=1}^{mr} cj α1j^k + Σ_{j=mr+1}^{m} dj (√−α2j)^k sin(2πf0j k + ψj),   (9)

where

f0j = (1/2π) cos−1(α1j/(2√−α2j))   (10)

and the cj’s, j = 1, . . . , mr, and the dj’s and ψj’s, j = mr + 1, . . . , m, are real constants. The pedagogical value of (9) is that it shows that ρk for an AR(p) process behaves like a mixture of damped exponentials and/or damped sinusoids.

Knowledge of the factors of ϕ(r) is the key to understanding the correlation structure of an AR(p), as well as the behavior of its realizations. Consequently, we find it very instructive to factor an AR(p) model into its first-order and irreducible second-order factors. We illustrate these ideas using Models (A)–(D).

3.1 Model A: Xt − 1.95Xt−1 + 1.85Xt−2 − 0.855Xt−3 = at

The characteristic polynomial is ϕ(r) = 1 − 1.95r + 1.85r² − 0.855r³, which can be written in the factored form of (8) as ϕ(r) = (1 − 0.95r)(1 − r + 0.9r²). In this case m = 2, j = 1 corresponds to the root r1 = 1/0.95 = 1.05, and j = 2 corresponds to the complex conjugate pair of roots, 0.556 ± 0.896i, of the quadratic equation 1 − r + 0.9r² = 0. Based on the factorization in (8), the model can be written in factored form as

(1 − 0.95B)(1 − B + 0.9B²)Xt = at.   (11)

When examining just the unfactored form of Model A, the features of this AR(3) model are unclear; however, inspection of the factored form in (11) immediately reveals a positive real root and a pair of complex conjugate roots. The absolute reciprocals of the roots can be easily seen to be 0.95 for the real root and √0.9 = 0.9487 for the complex root. Thus clearly the model is stationary, and the roots are all about the same distance from the unit circle.

The system frequency associated with the positive real root is 0, and from (10), the system frequency associated with the complex conjugate roots is given by f02 = (1/2π) cos−1(1/(2√0.9)) = 0.16. The factored form can be obtained in the general pth-order case via a standard numerical polynomial root-finding routine (e.g., Press et al. 2007).

In our introductory time series classes at SMU, we use a factor table such as the one shown in Table 1 to help students better comprehend the information about the first-order and irreducible second-order factors in the AR(p) model. This presentation has appeared in the literature (Woodward and Gray 1983,

Table 1. Factor table for Model A.

AR factors       Roots (rj)       |rj^−1|   f0j
1 − B + 0.9B²    0.556 ± 0.896i   0.95      0.16
1 − 0.95B        1.053            0.95      0

1993, 1995; Gray and Woodward 1986) but has not been included in the standard time series texts. In the factor table, for each first-order or irreducible second-order factor of the model, we show the

(i) roots of the characteristic equation,
(ii) proximity of the roots to the unit circle (by tabling the absolute reciprocal of the roots), and
(iii) system frequencies f0j.

We also use the convention that the roots are listed in the factor table in decreasing order of their nearness to the unit circle, because the closer to the unit circle, the stronger the autocorrelation effect.

From the information in Table 1, it is clear that realizations from this model will display wandering behavior associated with the factor (1 − 0.95B) and pseudocyclic behavior with period 1/0.16 ≈ 6 associated with (1 − B + 0.9B²). To answer the Challenge Question from Section 1 regarding Model A, it is clear that the only realization in Figure 1 that has these characteristics is Realization 2.
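A factor table like Table 1 can be produced mechanically from the model coefficients. The sketch below is ours, not the authors' software; it uses numpy's root finder together with the relations |rj^−1| = 1/|rj| and f0j = |arg(rj)|/2π, which reduces to (10) for a conjugate pair and gives 0 or 0.5 for positive and negative real roots:

```python
import numpy as np

def factor_table(phi):
    """Factor table for (1 - phi1 B - ... - phip B^p) X_t = a_t.

    Returns (root, |1/root|, f0) rows, one per real root or complex
    conjugate pair, sorted by decreasing |1/root| (nearness to the
    unit circle), using f0 = |arg(root)| / (2*pi).
    """
    coeffs = np.r_[-np.asarray(phi)[::-1], 1.0]  # highest degree first
    rows = []
    for r in np.roots(coeffs):
        if r.imag < -1e-8:
            continue  # keep only one root of each conjugate pair
        f0 = abs(np.angle(r)) / (2 * np.pi)
        rows.append((r, 1 / abs(r), f0))
    return sorted(rows, key=lambda row: -row[1])

# Model A: reproduces the two rows of Table 1 (up to rounding and row order)
for root, recip, f0 in factor_table([1.95, -1.85, 0.855]):
    print(f"root {complex(root):.3f}   |1/r| = {recip:.2f}   f0 = {f0:.2f}")
```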

Figure 3 shows autocorrelations and the spectral density for Model A. From (9), the autocorrelation function is given by

ρk = 0.612(0.95)^k + 0.392(√0.9)^k sin(2πf02k + 1.445),   (12)

where f02 = 0.16, and thus ρk is the sum of damped exponential and damped sinusoidal terms and has the appearance of a damped sinusoid along a damped exponential path. The spectral density has two peaks, one at f = 0 corresponding to the positive real root and the other at approximately f = 0.16 associated with the pair of complex conjugate roots.
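A quick numerical check of (12) that we find useful (this check is ours, not from the article): an autocorrelation function must equal 1 at lag 0, and the constants in (12) satisfy this up to rounding.

```python
import math

def rho_model_a(k):
    """Autocorrelation of Model A from (12), with f02 = 0.16."""
    return (0.612 * 0.95 ** k
            + 0.392 * math.sqrt(0.9) ** k * math.sin(2 * math.pi * 0.16 * k + 1.445))

print(round(rho_model_a(0), 2))  # 1.0, as required of rho at lag 0
```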

3.2 Model B: Xt − 0.2Xt−1 − 1.23Xt−2 + 0.26Xt−3 + 0.66Xt−4 = at

This model can be factored as (1 − 1.8B + 0.95B²)(1 + 1.6B + 0.7B²)Xt = at. The associated factor table is given in Table 2. With the factored form, the factor table shows two irreducible second-order factors. The factor associated with a system frequency of 0.06 is associated with roots closer to the unit circle than the other factor, which is associated with a system frequency of 0.45. Thus realizations will display a mixture of low-frequency behavior associated with a period of about 1/0.06 = 16.7 and high-frequency behavior with a period of

Figure 3. Autocorrelations and spectral density for Model A.


Table 2. Factor table for Model B.

AR factors          Roots (rj)      |rj^−1|   f0j
1 − 1.8B + 0.95B²   0.95 ± 0.39i    0.97      0.06
1 + 1.6B + 0.7B²    −1.15 ± 0.35i   0.83      0.45

1/0.45 = 2.2. Also, because the low-frequency behavior is associated with roots closer to the unit circle, this behavior will tend to dominate. Regarding the Challenge Question, it can be readily seen that Realization 3 in Figure 1 is the only one that is consistent with the model characteristics.

Figure 4 shows the autocorrelations and spectral density for this model. The autocorrelations are dominated by the behavior of the root closest to the unit circle, that is, the root associated with a period of 1/0.06 = 16.7. The spectral density shows peaks near both system frequencies, with the peak near 0.06 the higher of the two. This model provides a good illustration of the following teaching points:

(i) Roots closer to the unit circle have a more dominant effect.
(ii) Frequency behavior is better displayed in both the spectral density and the factor table than in the autocorrelations.

3.3 Model C: Xt − 0.38Xt−1 − 0.09Xt−2 − 0.19Xt−3 − 0.29Xt−4 = at

This model can be factored as (1 − 0.98B)(1 + 0.5B²)(1 + 0.6B)Xt = at. The associated factor table is given in Table 3. With the factored form, the factor table shows three factors: two linear factors and one irreducible second-order factor. There is a positive real root very close to the unit circle, a pair of complex roots associated with system frequency 0.25, and a negative real root, which is the root farthest removed from the unit circle. The table clearly shows that the model characteristics are dominated by the positive real (near unit) root. Recall that realizations from models with a dominant root near +1 will have a wandering behavior. The other two factors, associated with frequencies of 0.25 and 0.5, will manifest as (weaker) high-frequency behavior in realizations. Returning to the Challenge Question, it is crystal clear that the dominant wandering behavior with mild evidence of a higher-frequency behavior in Realization 1 of Figure 1 is consistent with the information conveyed in the factor table for Model C.
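Students can verify a factorization by multiplying the factors back together. A sketch (ours, not the article's software) using numpy's polynomial multiplication, with coefficient arrays in increasing powers of B; the article's unfactored coefficients are rounded to two decimals, so the product matches them only approximately:

```python
import numpy as np
from numpy.polynomial import polynomial as P

# (1 - 0.98B)(1 + 0.5B^2)(1 + 0.6B), coefficients in increasing powers of B
product = P.polymul(P.polymul([1, -0.98], [1, 0, 0.5]), [1, 0.6])
print(np.round(product, 2))  # matches Model C's coefficients after rounding
```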

Figure 4. Autocorrelations and spectral density for Model B.

Table 3. Factor table for Model C.

AR factors   Roots (rj)   |rj^−1|   f0j
1 − 0.98B    1.02         0.98      0
1 + 0.5B²    ±1.4i        0.70      0.25
1 + 0.6B     −1.67        0.60      0.50

While there is some evidence of the higher-frequency behavior in Realization 1 of Figure 1, the autocorrelations shown in Figure 5 have a slowly damping exponential behavior, with very little evidence of the two high-frequency factors. The spectral density in Figure 5 shows a strong peak at f = 0, with much smaller (nearly undetectable) peaks at f = 0.25 and 0.5. A point to be made with this example is that the factored form and the factor table can be used to identify roots near the unit circle.

3.4 Model D: Xt − Xt−1 − 0.5Xt−2 + 0.8Xt−3 − 0.7Xt−4 = at

This model can be factored as (1 − 1.25B)(1 + B)(1 − 0.75B + 0.56B²)Xt = at. The associated factor table is given in Table 4.

Answering the Challenge Question for this example is very easy. With the factored form, the factor table clearly shows that this model has a positive real root inside the unit circle. Thus this model is explosively nonstationary and corresponds to Realization 4 of Figure 1. No further analysis would be meaningful.

4. SOME TEACHING POINTS

In this section we give three ideas for using the information in the factor table to help students better understand AR(p) models.

4.1 Effect of Differencing

Model C is an example of a model with a (near) unit root. It is common practice to difference data such as that in the realization from Model C (i.e., Realization 1 in Figure 1) to demonstrate the more subtle behavior that is overwhelmed by the (near) unit root (see, e.g., Box, Jenkins, and Reinsel 2008). To illustrate, we difference the realization from Model C and

Figure 5. Autocorrelations and spectral density for Model C.


Table 4. Factor table for Model D.

AR factors           Roots (rj)     |rj^−1|   f0j
1 − 1.25B            0.8            1.25      0
1 + B                −1.0           1.0       0.5
1 − 0.75B + 0.56B²   0.67 ± 1.15i   0.75      0.17

show the differenced data at the top of Figure 6. We fit an AR(3) model based on Burg’s maximum entropy estimates (see Brockwell and Davis 2002) to the differenced data, Yt = Xt − Xt−1 = (1 − B)Xt, and the resulting fitted AR(3) model for the differenced data is

(1 + 0.66B + 0.49B² + 0.30B³)Yt = at.   (13)

As before, the features of this model are unclear from the model coefficients themselves; however, the factored form of the model in (13) is (1 + 0.03B + 0.47B²)(1 + 0.63B)Yt = at, and the corresponding factor table is given in Table 5.

Comparing the factors in Table 5 with those in Table 3 allows the students to recognize that the AR(3) model fit to the differenced data retains the non-unit root characteristics of the original model. Figure 6 also shows the autocorrelations and spectral density for this fitted AR(3) model. There it can be seen that the non-unit root characteristics of the original model are retained. This is particularly clear in the spectral density, where the peaks in the spectrum at f = 0.25 and f = 0.5 are much more distinct than those in Figure 5. It is useful to point out to the students that these characteristics were clearly displayed in the factor table in Table 3 for the original, undifferenced model.

4.2 Dominance of Roots Substantially Closer to the Unit Circle

A point that we emphasize to students is that roots near the unit circle dominate realization behavior whether or not the roots are near positive unity. The factor table for Model A (Table 1) shows that the associated characteristic equation has

Figure 6. Data in Realization 1 of Figure 1 (from Model C) differenced, along with autocorrelations and spectral density for the AR(3) model fit to the differenced data.

Table 5. Factor table for AR(3) model fit to differenced data from Model C.

AR factors           Roots (rj)      |rj^−1|   f0j
1 + 0.03B + 0.47B²   −0.03 ± 1.45i   0.69      0.25
1 + 0.63B            −1.58           0.63      0.50

a positive real root and a pair of complex conjugate roots. Also, all roots of the characteristic equation for Model A have |r^−1| ≈ 0.95, so they are all fairly close to the unit circle.

We find it instructional to consider the alternative models (in factored form)

(1 − 0.95B)(1 − 0.76B + 0.5B²)Xt = at   (14)

and

(1 − 0.7B)(1 − B + 0.9B²)Xt = at,   (15)

obtained by moving the complex conjugate roots associated with the characteristic equation for Model A further from the unit circle while leaving the positive real root unchanged in (14), and moving the positive real root away from the unit circle while leaving the complex conjugate roots unchanged in (15). It should be pointed out that in (14), the coefficient on B in the quadratic term is changed from 1.0 to 0.76, to maintain the same system frequency of f0 = 0.16. In both cases, any roots that were moved further from the unit circle have |r^−1| ≈ 0.7. Figures 7 and 8 show realizations, autocorrelations, and spectral densities for these two new models. From Figure 7, it can be seen that the positive real root associated with the factor 1 − 0.95B now dominates the behavior of the realization, autocorrelations, and spectral density. In fact, one might recommend differencing the data because of the near-unit root behavior displayed by the realization and autocorrelations. Figure 8 shows that moving the real root further from the unit circle in model (15) caused the cyclic behavior to dominate the realization, autocorrelations, and spectral density. This is analogous to the domination of the positive real root seen in the AR(3) model in (14) and in Model C.
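The claim that the quadratic factor of (14) keeps the system frequency at 0.16 while moving its roots to |r^−1| ≈ 0.7 can be checked with (10); a small verification sketch of ours:

```python
import math

# quadratic factor of (14): 1 - 0.76B + 0.5B^2, i.e. alpha1 = 0.76, alpha2 = -0.5
f0 = math.acos(0.76 / (2 * math.sqrt(0.5))) / (2 * math.pi)
print(round(f0, 2))              # 0.16, the same system frequency as in Model A
print(round(math.sqrt(0.5), 2))  # |1/r| about 0.7, farther from the unit circle
```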

Figure 7. Realization, autocorrelations, and spectral density for model (14).

340 Teacher’s Corner


Figure 8. Realization, autocorrelations, and spectral density for model (15).

4.3 Visualizing Autoregressive Components

Another instructive device for understanding the features of an AR(p) realization is the decomposition of the realization into additive components. In particular, if xt is a realization from an AR(p) for which there are no repeated roots of the characteristic equation, then xt = xt^(1) + xt^(2) + · · · + xt^(m), where m is as specified in (8) and xt^(j), which corresponds to the jth factor in the factor table, is a realization from an AR(1) for first-order factors and from an ARMA(2,1) for irreducible second-order factors. This can be shown using a partial fraction decomposition (Box, Jenkins, and Reinsel 2008, sec. 3.2.1). West (1997) used a state-space formulation to actually construct these components.

The point to emphasize to students is that each component corresponds to a row in the factor table. We illustrate the pedagogical usefulness of this decomposition using Model A. The AR(3) model (1 − 1.94B + 1.81B^2 − 0.78B^3)Xt = at is fit to Realization 2 in Figure 1 from Model A using Burg estimates. Figure 9 shows the realization, along with the additive AR(1) and ARMA(2,1) components obtained using the state-space approach suggested by West (1997). It can be seen that the ARMA(2,1) component contains the pseudo-cyclic portion of the realization, while the wandering behavior in the realization is captured by the AR(1) component.
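A minimal version of this construction can be sketched with an eigendecomposition of the AR companion matrix (Python with NumPy; our own illustration of the idea behind the state-space approach, not West's actual code). Each observation is split into one additive piece per characteristic root, with complex conjugate pairs combined into a single real component.

```python
import numpy as np

def additive_components(x, phi):
    """Split an AR(p) realization x into additive components, one per distinct
    characteristic root (conjugate pairs merged into one real component).
    phi are the AR coefficients: x_t = phi[0]*x_{t-1} + ... + phi[p-1]*x_{t-p} + a_t.
    """
    p = len(phi)
    F = np.zeros((p, p))                  # companion (state-transition) matrix
    F[0, :] = phi
    F[1:, :p - 1] = np.eye(p - 1)
    lam, E = np.linalg.eig(F)             # eigenvalues lam_j equal r_j^{-1}
    Einv = np.linalg.inv(E)
    raw = np.zeros((len(x), p), dtype=complex)
    for t in range(p - 1, len(x)):
        state = x[t - np.arange(p)]       # state vector (x_t, ..., x_{t-p+1})
        # x_t = e1' E E^{-1} state, so these p summands add back to x_t exactly
        raw[t, :] = E[0, :] * (Einv @ state)
    comps, used = [], set()
    for j in range(p):                    # merge conjugate pairs into real series
        if j in used:
            continue
        if abs(lam[j].imag) < 1e-8:
            comps.append(raw[:, j].real)
            used.add(j)
        else:
            k = next(k for k in range(p) if k not in used and k != j
                     and np.isclose(lam[k], lam[j].conjugate()))
            comps.append((raw[:, j] + raw[:, k]).real)
            used.update({j, k})
    return lam, comps
```

For the AR(3) fit above, the real eigenvalue yields the wandering AR(1)-type component and the conjugate pair yields the pseudo-cyclic ARMA(2,1)-type component; the components sum back to xt for t ≥ p − 1.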

We believe that using these decompositions is a great way to illustrate the underlying structure of AR(p) models. We recommend their use in introductory time series courses even if the concept of state-space models is not covered.

5. APPLICATION TO THE SUNSPOT DATA

Figure 9. Realization 2 in Figure 1 from Model A (a), along with the additive AR(1) (b) and ARMA(2,1) (c) components.

Figure 10(a) shows the annual sunspot numbers updated to 2008. These data classically show a 10- to 11-year cycle. Despite the obvious nonlinearity related to the relative constancy in the troughs and variability in the peaks, earlier versions of the sunspot data have been widely modeled using AR models, and we use that approach here for illustrative purposes. The AIC selects an AR(9) model whose Burg estimates are given by

(1 − 1.16B + 0.39B^2 + 0.18B^3 − 0.15B^4 + 0.10B^5 − 0.02B^6 − 0.02B^7 + 0.06B^8 − 0.24B^9)(Xt − 52.2) = at.

Figure 10. Yearly sunspot numbers (1749–2008), along with the first three additive components.

The American Statistician, November 2009, Vol. 63, No. 4 341


Table 6. Factor table for AR(9) model fit to sunspot data.

AR factors              Roots (rj)        |rj^-1|   f0j
1 − 1.61B + 0.95B^2     0.84 ± 0.58i      0.98      0.095
1 − 0.95B               1.056             0.95      0
1 − 0.62B + 0.73B^2     0.42 ± 1.09i      0.85      0.19
1 + 1.46B + 0.61B^2     −1.20 ± 0.44i     0.78      0.44
1 + 0.55B + 0.59B^2     −0.46 ± 1.21i     0.77      0.31

Clearly, any model that describes the behavior of the sunspot data must account for the 10- to 11-year cycle. However, examination of the parameters of the AR(9) model provides no information about how this cyclic behavior or any other features in the data are accounted for by the model. To obtain this sort of information, we examine the factor table given in Table 6. This table shows that the roots closest to the unit circle are associated with a system frequency of 0.095, and thus a period of 10.5. The next-closest root to the unit circle is a positive real root associated with the factor 1 − 0.95B. The remaining three factors are associated with higher-frequency behavior and, lying farther from the unit circle, are not as dominant. Similar information is given by the spectrum (not shown here).
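A factor table like Table 6 can be reproduced, at least approximately, from the printed AR(9) coefficients. The sketch below (Python with NumPy; our own illustration, not the article's software) solves the characteristic equation and tabulates |rj^-1| and f0j; because the printed coefficients are rounded to two decimals, the values only approximate Table 6, and each complex pair appears twice (once per conjugate).

```python
import numpy as np

# Characteristic polynomial 1 - 1.16z + 0.39z^2 + ... - 0.24z^9, with
# coefficients read off the printed AR(9) sunspot model (rounded values)
char_poly = [1, -1.16, 0.39, 0.18, -0.15, 0.10, -0.02, -0.02, 0.06, -0.24]

roots = np.roots(char_poly[::-1])     # np.roots expects highest power first
recip = 1.0 / roots
mods = np.abs(recip)                                # |r_j^{-1}|
f0s = np.abs(np.angle(recip)) / (2 * np.pi)         # system frequencies
print(f"{'root r_j':>16}  {'|1/r_j|':>7}  {'f0_j':>6}")
for j in np.argsort(-mods):           # closest to the unit circle first
    r = roots[j]
    label = (f"{r.real:.2f}{r.imag:+.2f}i" if abs(r.imag) > 1e-8
             else f"{r.real:.3f}")
    print(f"{label:>16}  {mods[j]:7.3f}  {f0s[j]:6.3f}")
```

The dominant conjugate pair should show |1/r| near 0.98 at a frequency near 0.095, i.e., a period of about 10.5 years, with a positive real root close behind, as in Table 6.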

Figure 10(b)–(d) shows the first three additive components associated with the factors of the AR(9) model. It can be clearly seen that the 10- to 11-year cycle dominates, while the first-order factor associated with component 2 seems to be related to the variation in peak heights. The higher-frequency factors play much more minor roles.

6. CONCLUDING REMARKS

Examining the factors (and additive components) clears up much of the “mystery” regarding higher-order AR models. Factoring highlights the relative proximity of roots to the unit circle and is useful pedagogically to show how roots near the unit circle dominate the behavior of autocorrelation functions, realizations, and spectral densities. It is common practice to express models with unit roots or seasonal models in a factored form, such as

(1 − B)ϕ(B)Xt = at,

(1 − B)^2 ϕ(B)Xt = at,

(1 − B^12)ϕ(B)Xt = at,

where ϕ(B) denotes the stationary component of the model. It is also instructive to factor the stationary component ϕ(B), however. Simple-to-use software that produces factor tables for a specified AR(p) model is available at http://www.texasoft.com/factor. This software also allows the user to fit an AR(p) model to a realization (using Burg estimates), print out the factor table, and plot the additive components associated with the fitted model.

Factoring also can help students better understand ARMA(p, q) models. Examining the factors in the AR and moving average components of a model can aid the understanding of interactions among AR and moving average factors; for example, similar factors in the AR and moving average parts of the model will have a nearly canceling effect on each other.
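This near-cancellation is easy to demonstrate with ψ-weights. The sketch below (Python with NumPy; our own illustration with made-up parameter values) compares an ARMA(1,1) model whose MA factor 1 − 0.90B nearly matches its AR factor 1 − 0.95B against the AR(1) model alone.

```python
import numpy as np

def psi_weights(phi, theta, n=20):
    """psi-weights of the ARMA(1,1) model (1 - phi*B) X_t = (1 - theta*B) a_t:
    psi_0 = 1, psi_1 = phi - theta, and psi_j = phi * psi_{j-1} for j >= 2."""
    psi = np.empty(n)
    psi[0] = 1.0
    psi[1] = phi - theta
    for j in range(2, n):
        psi[j] = phi * psi[j - 1]
    return psi

arma = psi_weights(0.95, 0.90)   # similar AR and MA factors nearly cancel
ar1 = psi_weights(0.95, 0.0)     # uncancelled AR(1): psi_j = 0.95**j
```

Beyond lag 0 the ARMA weights never exceed 0.05, so realizations are close to white noise, while the AR(1) weights start at 0.95 and decay slowly; the matched MA factor has almost removed the AR factor's effect.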

[Received December 2008. Revised May 2009.]

REFERENCES

Akaike, H. (1974), “A New Look at the Statistical Model Identification,” IEEE Transactions on Automatic Control, AC-19, 716–723.

Box, G. E. P., Jenkins, G. M., and Reinsel, G. C. (2008), Time Series Analysis: Forecasting and Control, Hoboken, NJ: Wiley.

Brockwell, P. J., and Davis, R. A. (1998), Time Series: Theory and Methods, New York: Springer-Verlag.

Brockwell, P. J., and Davis, R. A. (2002), Introduction to Time Series and Forecasting, New York: Springer-Verlag.

Cryer, J. D., and Chan, K. (2008), Time Series Analysis With Applications in R (2nd ed.), New York: Springer-Verlag.

Gray, H. L., and Woodward, W. A. (1986), “A New ARMA Spectral Estimator,” Journal of the American Statistical Association, 81, 1100–1108.

Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P. (2007), Numerical Recipes: The Art of Scientific Computing (3rd ed.), Cambridge, England: Cambridge University Press.

Shumway, R. H., and Stoffer, D. S. (2006), Time Series Analysis With R Examples (2nd ed.), New York: Springer-Verlag.

Tsay, R. (2002), Analysis of Financial Time Series, New York: Wiley.

West, M. (1997), “Time Series Decomposition,” Biometrika, 84, 489–494.

Woodward, W. A., and Gray, H. L. (1983), “A Comparison of Autoregressive and Harmonic Component Models for the Lynx Data,” Journal of the Royal Statistical Society, Ser. A, 146, 71–73.

Woodward, W. A., and Gray, H. L. (1993), “Global Warming and the Problem of Testing for Trend in Time Series Data,” Journal of Climate, 6, 953–962.

Woodward, W. A., and Gray, H. L. (1995), “Selecting a Model for Detecting the Presence of a Trend,” Journal of Climate, 8, 1929–1937.

