1
Non-Seasonal Box-Jenkins Models
2
Four-step iterative procedures
1) Model Identification
2) Parameter Estimation
3) Diagnostic Checking
4) Forecasting
3
Step One: Model Identification
4
Model Identification
I. Stationarity
II. Theoretical Autocorrelation Function (TAC)
III. Theoretical Partial Autocorrelation Function (TPAC)
IV. Sample Partial Autocorrelation Function (SPAC)
V. Sample Autocorrelation Function (SAC)
5
Stationarity (I)
A sequence of jointly dependent random variables $\{y_t\}$ is called a time series.
6
Stationarity (II)
Stationary process properties:
(1) $E(y_t) = \mu_y$ for all $t$.
(2) $\mathrm{Var}(y_t) = E[(y_t - \mu_y)^2] = \sigma_y^2$ for all $t$.
(3) $\mathrm{Cov}(y_t, y_{t-k}) = \gamma_k$ for all $t$.
7
Stationarity (III)
Example: The white noise series $\{\varepsilon_t\}$: the $\varepsilon_t$'s are iid $N(0, \sigma^2)$. Note that
(1) $E(\varepsilon_t) = 0$ for all $t$.
(2) $\mathrm{Var}(\varepsilon_t) = E(\varepsilon_t^2) = \sigma^2$ for all $t$.
(3) $\mathrm{Cov}(\varepsilon_t, \varepsilon_{t-s}) = 0$ for all $t$ and $s \neq 0$.
8
Stationarity (IV)
Three basic Box-Jenkins models for a stationary time series $\{y_t\}$:
(1) Autoregressive model of order p (AR(p)):
$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \cdots + \phi_p y_{t-p} + \varepsilon_t,$$
i.e., $y_t$ depends on its p previous values.
(2) Moving Average model of order q (MA(q)):
$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q},$$
i.e., $y_t$ depends on q previous random error terms.
9
Stationarity (V)
Three basic Box-Jenkins models for a stationary time series $\{y_t\}$:
(3) Autoregressive-moving average model of order p and q (ARMA(p,q)):
$$y_t = \phi_1 y_{t-1} + \cdots + \phi_p y_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q},$$
i.e., $y_t$ depends on its p previous values and q previous random error terms.
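These three model classes can be made concrete by simulation. The following is a minimal numpy sketch, not part of the slides: the helper name simulate_arma and the parameter values are illustrative choices, and the MA terms enter with the same minus-sign convention used in the equations above.

```python
import numpy as np

rng = np.random.default_rng(0)


def simulate_arma(phi, theta, n=500, sigma=1.0, burn=100):
    """Simulate y_t = phi_1 y_{t-1} + ... + phi_p y_{t-p}
                     + e_t - theta_1 e_{t-1} - ... - theta_q e_{t-q}."""
    phi = np.asarray(phi, dtype=float)
    theta = np.asarray(theta, dtype=float)
    p, q = len(phi), len(theta)
    e = rng.normal(0.0, sigma, n + burn)        # white noise e_t
    y = np.zeros(n + burn)
    for t in range(max(p, q, 1), n + burn):
        ar = phi @ y[t - p:t][::-1] if p else 0.0
        ma = theta @ e[t - q:t][::-1] if q else 0.0
        y[t] = ar + e[t] - ma
    return y[burn:]                             # discard the burn-in


ar1 = simulate_arma(phi=[0.7], theta=[])        # AR(1)
ma1 = simulate_arma(phi=[], theta=[0.5])        # MA(1)
arma11 = simulate_arma(phi=[0.7], theta=[0.5])  # ARMA(1,1)
```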
10
AR(1) (I)
Simple AR(1) process without drift:
$$y_t = \phi_1 y_{t-1} + \varepsilon_t \quad (\text{where } \varepsilon_t \text{ is white noise})$$
$$y_t = \phi_1 L y_t + \varepsilon_t \quad (\text{where } L \text{ is the backshift operator})$$
or
$$\phi(L)\, y_t = (1 - \phi_1 L)\, y_t = \varepsilon_t,$$
$$y_t = \frac{\varepsilon_t}{1 - \phi_1 L} = (1 + \phi_1 L + \phi_1^2 L^2 + \cdots)\,\varepsilon_t = \varepsilon_t + \phi_1 \varepsilon_{t-1} + \phi_1^2 \varepsilon_{t-2} + \cdots.$$
11
AR(1) (II)
Now,
(1) $E(y_t) = 0$ for all $t$.
(2) $\mathrm{Var}(y_t) = \dfrac{\sigma^2}{1 - \phi_1^2}$ for all $t$.
(3) $\mathrm{Cov}(y_t, y_{t-s}) = \dfrac{\phi_1^s \sigma^2}{1 - \phi_1^2}$ for all $t$.
$\mathrm{Var}(y_t)$ and $\mathrm{Cov}(y_t, y_{t-s})$ are finite if and only if $|\phi_1| < 1$, which is the stationarity requirement for an AR(1) process.
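A quick simulation check of these moments can be useful. The sketch below is illustrative and not from the slides; it assumes phi1 = 0.7 and sigma = 1 and compares the sample variance and autocovariances of a long simulated AR(1) path with $\sigma^2/(1-\phi_1^2)$ and $\phi_1^s \sigma^2/(1-\phi_1^2)$.

```python
import numpy as np

rng = np.random.default_rng(1)
phi1, sigma, n = 0.7, 1.0, 200_000

# simulate a long AR(1) path: y_t = phi1 * y_{t-1} + e_t
e = rng.normal(0.0, sigma, n)
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi1 * y[t - 1] + e[t]

var_theory = sigma**2 / (1 - phi1**2)            # about 1.961
print(f"Var : theory {var_theory:.3f}  sample {y.var():.3f}")
for s in (1, 2, 3):
    cov_theory = phi1**s * var_theory            # phi1^s * sigma^2 / (1 - phi1^2)
    cov_sample = np.cov(y[s:], y[:-s])[0, 1]
    print(f"Cov lag {s}: theory {cov_theory:.3f}  sample {cov_sample:.3f}")
```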
12
AR(1) (IV)
Special Case: $\phi_1 = 1$
It is a “random walk” process. Now,
$$y_t = y_{t-1} + \varepsilon_t.$$
Thus,
$$y_t = \sum_{j=0}^{t-1} \varepsilon_{t-j},$$
and
(1) $E(y_t) = 0$ for all $t$.
(2) $\mathrm{Var}(y_t) = t\sigma^2$ for all $t$.
(3) $\mathrm{Cov}(y_t, y_{t-s}) = (t - |s|)\sigma^2$ for all $t$.
13
AR(1) (V)
Consider
$$\Delta y_t = y_t - y_{t-1} = \varepsilon_t.$$
$y_t$ is a homogeneous non-stationary series. The number of times that the original series must be differenced before a stationary series results is called the order of integration.
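As an illustration of the order of integration, the sketch below (not from the slides, simulated data only) generates many random-walk paths, which are integrated of order one, and shows that $\mathrm{Var}(y_t)$ grows with $t$ while the first difference has constant variance.

```python
import numpy as np

rng = np.random.default_rng(2)

# 5000 independent random-walk paths of length 400
e = rng.normal(size=(5000, 400))
y = np.cumsum(e, axis=1)          # random walk: y_t = y_{t-1} + e_t
dy = np.diff(y, axis=1)           # first difference: y_t - y_{t-1} = e_t

# Var(y_t) grows like t * sigma^2 (non-stationary) ...
print(y[:, 49].var(), y[:, 199].var(), y[:, 399].var())    # ~50, ~200, ~400
# ... while the differenced series has constant variance sigma^2 = 1
print(dy[:, 49].var(), dy[:, 199].var(), dy[:, 398].var())  # all ~1
```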
14
Theoretical Autocorrelation Function (TAC) (I)
Autoregressive (AR) Processes
Consider an AR(1) process without drift:
$$y_t = \phi_1 y_{t-1} + \varepsilon_t.$$
Recall that
(1) $E(y_t) = 0$ for all $t$.
(2) $\mathrm{Var}(y_t) = \gamma_0 = \dfrac{\sigma^2}{1 - \phi_1^2}$ for all $t$.
(3) $\mathrm{Cov}(y_t, y_{t-s}) = \gamma_s = \dfrac{\phi_1^s \sigma^2}{1 - \phi_1^2}$ for all $t$.
15
Theoretical Autocorrelation Function (TAC) (II)
The autocorrelation function at lag k is
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \phi_1^k \quad \text{for } k = 0, 1, 2, \ldots$$
So for a stationary AR(1) process, the TAC dies down gradually as k increases.
16
Theoretical Autocorrelation Function (TAC) (III)
Consider an AR(2) process without drift:
$$y_t = \phi_1 y_{t-1} + \phi_2 y_{t-2} + \varepsilon_t.$$
The TAC functions are
$$\rho_1 = \frac{\phi_1}{1 - \phi_2}, \qquad \rho_2 = \phi_2 + \frac{\phi_1^2}{1 - \phi_2}, \qquad \rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2} \quad \text{for } k > 2.$$
17
Theoretical Autocorrelation Function (TAC) (IV)
Then the TAC dies down according to a mixture of damped exponentials and/or damped sine waves.
In general, the TAC of a stationary AR process dies down gradually as k increases.
18
Theoretical Autocorrelation Function (TAC) (V)
Moving Average (MA) Processes
Consider a MA(1) process without drift:
$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1}.$$
Recall that
(1) $E(y_t) = 0$ for all $t$.
(2) $\mathrm{Var}(y_t) = \gamma_0 = (1 + \theta_1^2)\sigma^2$ for all $t$.
(3) $\mathrm{Cov}(y_t, y_{t-s}) = \gamma_s = -\theta_1 \sigma^2$ for $s = 1$, and $0$ for $s > 1$.
19
Theoretical Autocorrelation Function (TAC) (VI)
Therefore the TAC of the MA(1) process is
$$\rho_k = \frac{\gamma_k}{\gamma_0} = \begin{cases} \dfrac{-\theta_1}{1 + \theta_1^2} & k = 1, \\[6pt] 0 & k > 1. \end{cases}$$
The TAC of the MA(1) process “cuts off” after lag k = 1.
20
Theoretical Autocorrelation Function (TAC) (VII)
Consider a MA(2) process:
$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}.$$
Its TAC is
$$\rho_1 = \frac{-\theta_1(1 - \theta_2)}{1 + \theta_1^2 + \theta_2^2}, \qquad \rho_2 = \frac{-\theta_2}{1 + \theta_1^2 + \theta_2^2}, \qquad \rho_k = 0 \quad \text{for } k > 2.$$
The TAC of a MA(2) process cuts off after 2 lags.
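The contrast between “dies down” and “cuts off” can be made concrete with a few numbers. The sketch below (illustrative values $\phi_1 = 0.7$ and $\theta_1 = 0.5$, not from the slides) evaluates the TAC formulas above for an AR(1) and a MA(1).

```python
import numpy as np

phi1, theta1 = 0.7, 0.5
k = np.arange(1, 7)

# AR(1): rho_k = phi1**k, so the TAC dies down geometrically
tac_ar1 = phi1**k

# MA(1): rho_1 = -theta1 / (1 + theta1**2) and rho_k = 0 for k > 1 (cuts off)
tac_ma1 = np.where(k == 1, -theta1 / (1 + theta1**2), 0.0)

print(np.round(tac_ar1, 3))  # [0.7   0.49  0.343 0.24  0.168 0.118]
print(np.round(tac_ma1, 3))  # [-0.4  0.    0.    0.    0.    0.   ]
```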
21
Theoretical Partial Autocorrelation Function (TPAC) (I)
Autoregressive Processes
By the definition of the PAC, the parameter $\phi_k$ is the kth PAC $\phi_{kk}$. Therefore, the partial autocorrelation function at lag k is
$$\phi_{kk} = \phi_k.$$
As mentioned before, if k = 1, then
$$\phi_{11} = \rho_1 = \phi_1.$$
That is, PAC = AC. The TPAC of an AR(1) process “cuts off” after lag 1.
22
Theoretical Partial Autocorrelation Function (TPAC) (II)
Moving Average Processes
Consider
$$y_t = \varepsilon_t - \theta_1 \varepsilon_{t-1},$$
which (assuming invertibility, $|\theta_1| < 1$) can be rewritten as
$$y_t = \sum_{j=1}^{\infty} \pi_j\, y_{t-j} + \varepsilon_t, \qquad \pi_j = -\theta_1^{\,j},$$
which is a stationary AR process with infinite order. Thus, the partial autocorrelation decays towards zero as j increases.
23
Summary of the Behaviors of TAC and TPAC (I)
Behaviors of TAC and TPAC for general non-seasonal models
Autoregressive of order p: $z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \cdots + \phi_p z_{t-p} + \varepsilon_t$
  TAC: Dies down. TPAC: Cuts off after lag p.
Moving average of order q: $z_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}$
  TAC: Cuts off after lag q. TPAC: Dies down.
Mixed autoregressive-moving average of order (p,q): $z_t = \phi_1 z_{t-1} + \cdots + \phi_p z_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}$
  TAC: Dies down. TPAC: Dies down.
24
Summary of the Behaviors of TAC and TPAC (II)
Behaviors of TAC and TPAC for specific non-seasonal models
First-order autoregressive: $z_t = \phi_1 z_{t-1} + \varepsilon_t$
  TAC: Dies down in a damped exponential fashion; specifically, $\rho_k = \phi_1^k$ for $k \ge 1$.
  TPAC: Cuts off after lag 1.
Second-order autoregressive: $z_t = \phi_1 z_{t-1} + \phi_2 z_{t-2} + \varepsilon_t$
  TAC: Dies down according to a mixture of damped exponentials and/or damped sine waves; specifically, $\rho_1 = \phi_1/(1-\phi_2)$, $\rho_2 = \phi_2 + \phi_1^2/(1-\phi_2)$, and $\rho_k = \phi_1 \rho_{k-1} + \phi_2 \rho_{k-2}$ for $k \ge 3$.
  TPAC: Cuts off after lag 2.
25
Summary of the Behaviors of TAC and TPAC (III)
Behaviors of TAC and TPAC for specific non-seasonal models
First-order moving average: $z_t = \varepsilon_t - \theta_1 \varepsilon_{t-1}$
  TAC: Cuts off after lag 1; specifically, $\rho_1 = -\theta_1/(1+\theta_1^2)$ and $\rho_k = 0$ for $k \ge 2$.
  TPAC: Dies down in a fashion dominated by damped exponential decay.
Second-order moving average: $z_t = \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2}$
  TAC: Cuts off after lag 2; specifically, $\rho_1 = -\theta_1(1-\theta_2)/(1+\theta_1^2+\theta_2^2)$, $\rho_2 = -\theta_2/(1+\theta_1^2+\theta_2^2)$, and $\rho_k = 0$ for $k > 2$.
  TPAC: Dies down according to a mixture of damped exponentials and/or damped sine waves.
26
Summary of the Behaviors of TAC and TPAC (IV)
Behaviors of TAC and TPAC for specific non-seasonal models
Mixed autoregressive-moving average of order (1,1): $z_t = \phi_1 z_{t-1} + \varepsilon_t - \theta_1 \varepsilon_{t-1}$
  TAC: Dies down in a damped exponential fashion; specifically, $\rho_1 = \dfrac{(1 - \phi_1\theta_1)(\phi_1 - \theta_1)}{1 + \theta_1^2 - 2\phi_1\theta_1}$ and $\rho_k = \phi_1 \rho_{k-1}$ for $k \ge 2$.
  TPAC: Dies down in a fashion dominated by damped exponential decay.
27
Sample Autocorrelation Function (SAC) (I)
For the working series $z_b, z_{b+1}, \ldots, z_n$, the sample autocorrelation at lag k is
$$r_k = \frac{\sum_{t=b}^{n-k} (z_t - \bar{z})(z_{t+k} - \bar{z})}{\sum_{t=b}^{n} (z_t - \bar{z})^2}$$
where
$$\bar{z} = \frac{\sum_{t=b}^{n} z_t}{n - b + 1}.$$
28
Sample Autocorrelation Function (SAC) (II)
$r_k$ measures the linear relationship between time series observations separated by a lag of k time units.
The standard error of $r_k$ is
$$s_{r_k} = \left[ \frac{1 + 2\sum_{j=1}^{k-1} r_j^2}{n - b + 1} \right]^{1/2}.$$
The $t_{r_k}$ statistic is
$$t_{r_k} = \frac{r_k}{s_{r_k}}.$$
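The sketch below (the function name sac is mine, not from the slides) is a direct numpy translation of the $r_k$, $s_{r_k}$ and $t_{r_k}$ formulas on the last two slides, applied to an illustrative simulated series.

```python
import numpy as np


def sac(z, max_lag):
    """Sample autocorrelations r_k, standard errors s_{r_k}, and t_{r_k}
    statistics for the working series z = (z_b, ..., z_n)."""
    z = np.asarray(z, dtype=float)
    m = len(z)                                   # m = n - b + 1
    zbar = z.mean()
    denom = np.sum((z - zbar) ** 2)
    r, s, t = [], [], []
    for k in range(1, max_lag + 1):
        rk = np.sum((z[:-k] - zbar) * (z[k:] - zbar)) / denom
        sk = np.sqrt((1.0 + 2.0 * np.sum(np.square(r))) / m)  # uses r_1 .. r_{k-1}
        r.append(rk)
        s.append(sk)
        t.append(rk / sk)
    return np.array(r), np.array(s), np.array(t)


# a spike at lag k is flagged when |t_{r_k}| > 2 (see the next slide)
r, s, t = sac(np.random.default_rng(3).normal(size=200), max_lag=10)
print(np.abs(t) > 2)
```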
29
Sample Autocorrelation Function (SAC) (III)
Behaviors of SAC
(1) The SAC can cut off. A spike at lag k exists in the SAC if $r_k$ is statistically large. If
$$|t_{r_k}| > 2,$$
then $r_k$ is considered to be statistically large. The SAC cuts off after lag k if there are no spikes at lags greater than k in the SAC.
30
Sample Autocorrelation Function (SAC) (IV)
(2) The SAC dies down if this function does not cut off but rather decreases in a ‘steady fashion’. The SAC can die down in
(i) a damped exponential fashion,
(ii) a damped sine-wave fashion, or
(iii) a fashion dominated by either one of, or a combination of, (i) and (ii).
The SAC can die down fairly quickly or extremely slowly.
31
Sample Autocorrelation Function (SAC) (V)
The time series values $z_b, z_{b+1}, \ldots, z_n$ should be considered stationary if the SAC of the time series values either cuts off fairly quickly or dies down fairly quickly.
However, if the SAC of the time series values $z_b, z_{b+1}, \ldots, z_n$ dies down extremely slowly, then the time series values should be considered non-stationary.
32
Sample Partial Autocorrelation Function (SPAC) (I)
The sample partial autocorrelation at lag k is
$$r_{kk} = \begin{cases} r_1 & \text{if } k = 1, \\[6pt] \dfrac{r_k - \sum_{j=1}^{k-1} r_{k-1,\,j}\, r_{k-j}}{1 - \sum_{j=1}^{k-1} r_{k-1,\,j}\, r_j} & \text{if } k = 2, 3, \ldots \end{cases}$$
where
$$r_{kj} = r_{k-1,\,j} - r_{kk}\, r_{k-1,\,k-j}$$
for j = 1, 2, …, k-1.
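A sketch of this recursion (the helper name spac is assumed, not from the slides), taking the sample autocorrelations $r_1, r_2, \ldots$ from the previous slides as input:

```python
import numpy as np


def spac(r, max_lag):
    """Sample partial autocorrelations r_kk computed recursively from the
    sample autocorrelations r = [r_1, r_2, ...]."""
    r = np.asarray(r, dtype=float)
    rkk = np.zeros(max_lag + 1)                  # rkk[k] holds r_kk (index 0 unused)
    prev = {}                                    # prev[j] holds r_{k-1, j}
    for k in range(1, max_lag + 1):
        if k == 1:
            rkk[1] = r[0]
        else:
            num = r[k - 1] - sum(prev[j] * r[k - j - 1] for j in range(1, k))
            den = 1.0 - sum(prev[j] * r[j - 1] for j in range(1, k))
            rkk[k] = num / den
        # r_{k, j} = r_{k-1, j} - r_kk * r_{k-1, k-j} for j = 1, ..., k-1
        cur = {j: prev[j] - rkk[k] * prev[k - j] for j in range(1, k)}
        cur[k] = rkk[k]
        prev = cur
    return rkk[1:]
```

Since $s_{r_{kk}} = 1/\sqrt{n-b+1}$ (next slide), a spike at lag k can be flagged when $|r_{kk}|\sqrt{n-b+1} > 2$.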
33
Sample Partial Autocorrelation Function (SPAC) (II)
$r_{kk}$ may intuitively be thought of as the sample autocorrelation of time series observations separated by a lag of k time units, with the effects of the intervening observations eliminated.
The standard error of $r_{kk}$ is
$$s_{r_{kk}} = \frac{1}{\sqrt{n - b + 1}}.$$
The $t_{r_{kk}}$ statistic is
$$t_{r_{kk}} = \frac{r_{kk}}{s_{r_{kk}}}.$$
34
Sample Partial Autocorrelation Function (SPAC) (III)
The behaviors of the SPAC are similar to those of the SAC. The only difference is that $r_{kk}$ is considered to be statistically large if
$$|t_{r_{kk}}| > 2$$
for any k.
35
Sample Partial Autocorrelation Function (SPAC) (IV)
The behaviors of the SAC and the SPAC of a time series help to tentatively identify a Box-Jenkins model.
Each Box-Jenkins model is characterized by its theoretical autocorrelation (TAC) function and its theoretical partial autocorrelation (TPAC) function.
36
Step Two: Parameter Estimation
37
Parameter Estimation
Given n observations $y_1, y_2, \ldots, y_n$, the likelihood function L is defined to be the probability of obtaining the data actually observed.
For non-seasonal Box-Jenkins models, L will be a function of $\delta$, the $\phi$'s, the $\theta$'s and $\sigma^2$ given $y_1, y_2, \ldots, y_n$.
The maximum likelihood estimators (m.l.e.) are those values of the parameters for which the data actually observed are most likely, that is, the values that maximize the likelihood function L.
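As a sketch of the idea (not the slides' own procedure), the code below maximizes a conditional Gaussian log-likelihood for an AR(1) with drift using scipy.optimize.minimize. Conditioning on the first observation rather than using the exact likelihood, and the simulated data, are simplifying assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def neg_loglik(params, y):
    """Negative conditional Gaussian log-likelihood of
    y_t = delta + phi1 * y_{t-1} + e_t, with e_t ~ N(0, sigma^2)."""
    delta, phi1, log_sigma = params
    sigma2 = np.exp(2.0 * log_sigma)
    resid = y[1:] - delta - phi1 * y[:-1]
    m = len(resid)
    return 0.5 * (m * np.log(2.0 * np.pi * sigma2) + np.sum(resid**2) / sigma2)


# illustrative data: a simulated AR(1) path with phi1 = 0.6, sigma = 1
rng = np.random.default_rng(4)
y = np.zeros(400)
for t in range(1, 400):
    y[t] = 0.6 * y[t - 1] + rng.normal()

fit = minimize(neg_loglik, x0=np.array([0.0, 0.1, 0.0]), args=(y,))
delta_hat, phi1_hat, log_sigma_hat = fit.x
print(phi1_hat, np.exp(log_sigma_hat))   # roughly 0.6 and 1.0
```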
38
Step Three: Diagnostic Checking
39
Diagnostic Checking
Often it is not straightforward to determine a single model that most adequately represents the data generating process. The suggested tests include
(1) residual analysis,
(2) overfitting,
(3) model selection criteria.
40
Residual Analysis
If an ARMA(p,q) model is an adequate representation of the data generating process, then the residuals should be uncorrelated.
Use the Box-Pierce statistic
$$Q(k) = (n - d) \sum_{l=1}^{k} r_l^2(\hat{e}) \;\sim\; \chi^2_{(k - p - q)}$$
or the Ljung-Box-Pierce statistic
$$Q^*(k) = (n - d)(n - d + 2) \sum_{l=1}^{k} \frac{r_l^2(\hat{e})}{n - d - l} \;\sim\; \chi^2_{(k - p - q)},$$
where $d$ is the degree of differencing and $r_l(\hat{e})$ is the sample autocorrelation of the residuals at lag $l$.
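A sketch computing both statistics from a residual series (the function name portmanteau is assumed, not from the slides); scipy.stats.chi2 supplies the critical value, and the length of the residual series plays the role of $n - d$.

```python
import numpy as np
from scipy.stats import chi2


def portmanteau(resid, k, p, q):
    """Box-Pierce Q(k) and Ljung-Box-Pierce Q*(k) from the residuals of a
    fitted ARMA(p,q) model; len(resid) plays the role of n - d."""
    e = np.asarray(resid, dtype=float)
    m = len(e)                                   # n - d
    ebar = e.mean()
    denom = np.sum((e - ebar) ** 2)
    r = np.array([np.sum((e[:-l] - ebar) * (e[l:] - ebar)) / denom
                  for l in range(1, k + 1)])     # residual autocorrelations
    Q = m * np.sum(r**2)
    Qstar = m * (m + 2) * np.sum(r**2 / (m - np.arange(1, k + 1)))
    crit = chi2.ppf(0.95, df=k - p - q)          # 5% critical value
    return Q, Qstar, crit                        # model questionable if Q* > crit
```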
41
Overfitting
If an ARMA(p,q) model is specified, then we could estimate an ARMA(p+1,q) or an ARMA(p,q+1) process.
Then we check the significance of the additional parameters (but be aware of multicollinearity problems).
42
Model Selection Criteria
Akaike Information Criterion (AIC): AIC = -2 ln(L) + 2k
Schwarz Bayesian Criterion (SBC): SBC = -2 ln(L) + k ln(n)
where L = likelihood function, k = number of parameters to be estimated, and n = number of observations.
Ideally, the AIC and SBC will be as small as possible.
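A minimal sketch of the two criteria; the candidate models and their log-likelihood values below are hypothetical numbers used only to show how the comparison works.

```python
import numpy as np


def aic(loglik, k):
    return -2.0 * loglik + 2.0 * k


def sbc(loglik, k, n):
    return -2.0 * loglik + k * np.log(n)


# hypothetical log-likelihoods for two candidate models fitted to n = 200 points
n = 200
candidates = {"ARMA(1,0)": (-285.3, 2), "ARMA(1,1)": (-284.9, 3)}
for name, (ll, k) in candidates.items():
    print(name, round(aic(ll, k), 1), round(sbc(ll, k, n), 1))
# SBC penalizes the extra parameter more heavily than AIC because ln(200) > 2
```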
43
Step Four: Forecasting
44
Forecasting
Given the stationary series $z_b, z_{b+1}, \ldots, z_t$, we would like to forecast the value $z_{t+l}$.
$\hat{z}_{t+l}(t)$ = l-step-ahead forecast of $z_{t+l}$ made at time t.
$e_{t+l}(t)$ = l-step-ahead forecast error = $z_{t+l} - \hat{z}_{t+l}(t)$.
The l-step-ahead forecast is derived using the minimum mean square error forecast and is given by
$$\hat{z}_{t+l}(t) = E(z_{t+l} \mid z_b, z_{b+1}, \ldots, z_t).$$
45
Forecasting with AR(1) model (I)
The AR(1) time series model is
$$z_t = \phi_1 z_{t-1} + \varepsilon_t,$$
where $\varepsilon_t \sim N(0, \sigma^2)$.
1-step-ahead point forecast:
$$\hat{z}_{t+1}(t) = E(z_{t+1} \mid z_b, z_{b+1}, \ldots, z_t) = E(\phi_1 z_t + \varepsilon_{t+1} \mid z_b, z_{b+1}, \ldots, z_t).$$
46
Forecasting with AR(1) model (II)
Recall that $\varepsilon_{t+1}$ is independent of $z_b, z_{b+1}, \ldots, z_t$ and it has a zero mean. Thus,
$$\hat{z}_{t+1}(t) = \phi_1 z_t.$$
The forecast error is
$$e_{t+1}(t) = z_{t+1} - \hat{z}_{t+1}(t) = (\phi_1 z_t + \varepsilon_{t+1}) - \phi_1 z_t = \varepsilon_{t+1}.$$
Then the variance of the forecast error is
$$\mathrm{var}[e_{t+1}(t)] = \mathrm{var}(\varepsilon_{t+1}) = \sigma^2.$$
47
Forecasting with AR(1) model (III)
2-step-ahead point forecast:
$$\hat{z}_{t+2}(t) = \phi_1 \hat{z}_{t+1}(t).$$
The forecast error is
$$e_{t+2}(t) = z_{t+2} - \hat{z}_{t+2}(t) = (\phi_1 z_{t+1} + \varepsilon_{t+2}) - \phi_1 \hat{z}_{t+1}(t) = \phi_1 \bigl(z_{t+1} - \hat{z}_{t+1}(t)\bigr) + \varepsilon_{t+2} = \phi_1 \varepsilon_{t+1} + \varepsilon_{t+2}.$$
The forecast error variance is
$$\mathrm{var}[e_{t+2}(t)] = \phi_1^2\, \mathrm{var}(\varepsilon_{t+1}) + \mathrm{var}(\varepsilon_{t+2}) = (1 + \phi_1^2)\sigma^2.$$
48
Forecasting with AR(1) model (IV)
l-step-ahead point forecast:
$$\hat{z}_{t+l}(t) = \phi_1 \hat{z}_{t+l-1}(t) \quad \text{for } l \ge 2.$$
The forecast error is
$$e_{t+l}(t) = \varepsilon_{t+l} + \phi_1 \varepsilon_{t+l-1} + \cdots + \phi_1^{\,l-1} \varepsilon_{t+1} \quad \text{for } l \ge 1.$$
The forecast error variance is
$$\mathrm{var}[e_{t+l}(t)] = \frac{1 - \phi_1^{2l}}{1 - \phi_1^2}\,\sigma^2 \quad \text{for } l \ge 1.$$
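The sketch below (the helper name ar1_forecast is assumed, not from the slides) evaluates the point forecasts and forecast-error variances from these formulas; the 1.96 multiplier for an approximate 95% prediction interval is a standard addition under Gaussian errors, not something stated on the slide.

```python
import numpy as np


def ar1_forecast(z_t, phi1, sigma2, L):
    """l-step-ahead AR(1) point forecasts, forecast-error variances and
    approximate 95% prediction intervals for l = 1, ..., L."""
    ell = np.arange(1, L + 1)
    zhat = phi1**ell * z_t                                  # phi1^l * z_t
    var_e = sigma2 * (1.0 - phi1 ** (2 * ell)) / (1.0 - phi1**2)
    half = 1.96 * np.sqrt(var_e)
    return zhat, var_e, np.column_stack([zhat - half, zhat + half])


zhat, var_e, pi = ar1_forecast(z_t=2.0, phi1=0.7, sigma2=1.0, L=5)
print(np.round(zhat, 3))   # forecasts decay towards the (zero) mean
print(np.round(var_e, 3))  # variances rise towards sigma^2 / (1 - phi1^2)
```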
49
Forecasting with MA(1) model (I)
The MA(1) model is
$$z_t = \varepsilon_t - \theta_1 \varepsilon_{t-1},$$
where $\varepsilon_t \sim N(0, \sigma^2)$.
l-step-ahead point forecast:
$$\hat{z}_{t+l}(t) = \begin{cases} -\theta_1 \varepsilon_t & \text{for } l = 1, \\ 0 & \text{for } l \ge 2. \end{cases}$$
50
Forecasting with MA(1) model (II)
The forecast error is
$$e_{t+l}(t) = \begin{cases} \varepsilon_{t+1} & \text{for } l = 1, \\ \varepsilon_{t+l} - \theta_1 \varepsilon_{t+l-1} & \text{for } l \ge 2. \end{cases}$$
The variance of the forecast error is
$$\mathrm{var}[e_{t+l}(t)] = \begin{cases} \sigma^2 & \text{for } l = 1, \\ (1 + \theta_1^2)\sigma^2 & \text{for } l \ge 2. \end{cases}$$
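A sketch of MA(1) forecasting (the helper name ma1_forecast is assumed, not from the slides): the residuals are first recovered recursively, taking the pre-sample error as zero, which assumes the model is invertible ($|\theta_1| < 1$).

```python
import numpy as np


def ma1_forecast(z, theta1, L):
    """Forecasts for z_t = e_t - theta1 * e_{t-1}: recover the residuals
    recursively (pre-sample error taken as 0), then
    zhat_{t+1}(t) = -theta1 * e_t and zhat_{t+l}(t) = 0 for l >= 2."""
    z = np.asarray(z, dtype=float)
    e = np.zeros(len(z))
    for t in range(len(z)):
        e[t] = z[t] + theta1 * (e[t - 1] if t > 0 else 0.0)
    zhat = np.zeros(L)
    zhat[0] = -theta1 * e[-1]       # 1-step-ahead forecast
    return zhat                      # the remaining entries stay at 0
```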