WWW.VU.EDU.AU
BUSINESS STATISTICS
BEO1106 WEEK 11
Dr. Hubert Fernando and Dr. Sidney Lung, 2012
BUSINESS AND LAW
TIME-SERIES ANALYSIS
Textbook Chapter 14
• According to classical time-series analysis, an observed time series is the combination of some pattern and random variations.
The aim is to separate them from each other in order to
a) describe the historical pattern in the data, and
b) prepare forecasts by projecting the revealed historical pattern into the future.
• The pattern itself is likely to contain some, or all, of the following three components: trend, seasonal and cyclical.
Trend: The long-term general change in the level of the data with a duration of longer than a year.
It can be linear (straight line),
$Y_i = b_0 + b_1 X_i$
or non-linear (smooth curve).
[Figure: Hourly earnings: Manufacturing: Major seven countries, 1995 = 100, Sep-70 to Sep-00 (axes: t, Yt)]
[Figure: Broad money (sa): Sweden, 1995 = 100, Jan-61 to Jan-01 (axes: t, Yt)]
Seasonal variations: Regular fluctuations within a period of no longer than a year.
Seasonal variations are usually associated with the four seasons of the year.
[Figure: Hungary: Commodity output: Cement, '000 tonnes, Dec-82 to Dec-00 (axes: t, Yt)]
[Figure: Australia: Retail turnover: Department stores, $m, Jan-83 to Jan-01 (axes: t, Yt)]
Cyclical variations: Fluctuations around the long-term trend, lasting longer than a year.
The time gap between the beginning trough and the ending trough is the length of the cycle.
[Diagram: one cycle of Yt against t, running from the beginning trough through the peak to the ending trough]
[Figure: Aus: Dwelling units approved: Private: New houses, number, Dec-70 to Dec-00]
[Figure: Expenditure on GDP: Construction: United States (sa), bln 96 USD, Dec-60 to Dec-00]
Cyclical variations are attributed to business cycles; to the ups and downs in the level of business activity.
• The random variations of the data comprise the deviations of the observed time series from the underlying pattern.
When this irregular component is strong compared to the (quasi-)regular components, it tends to hide the seasonal and cyclical variations, and it is difficult to separate from the pattern.
However, if we manage to capture the trend, the seasonal and cyclical variations, the remaining changes do not have any discernible pattern, so they are totally unpredictable.
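The four components of a time series (T: trend, S: seasonal, C: cyclical, I: irregular, i.e. random) can be combined in different ways.
Additive: $Y_i = T_i + S_i + C_i + I_i$
Multiplicative: $Y_i = T_i \times S_i \times C_i \times I_i$
e.g. If the trend is linear, these two models look as follows:
Additive: $Y_i = (b_0 + b_1 X_i) + S_i + C_i + I_i$
Multiplicative: $Y_i = (b_0 + b_1 X_i) \times S_i \times C_i \times I_i$
Additive: 135 = 60 + 12 + 36 + 27
Multiplicative: 135 = 60 x 1.2 x 1.5 x 1.25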
[Figure: Australia: Retail turnover: Recreational goods, $m, Jan-83 to Jan-01]
[Figure: Austria: Domestic demand: Retail sales: Volume, 1995 = 100, Dec-76 to Dec-00]
Both time series have an increasing linear trend component.
In one of them the fluctuations around this trend keep the same intensity, which suits an additive model.
In the other the fluctuations around this trend become more and more intense, which suits a multiplicative model.
SMOOTHING TECHNIQUES
Smoothing techniques are used to reduce the random fluctuations in a time series so as to expose the other components more clearly.
Example 1: The daily (Monday – Friday) sales figures during the last four weeks were recorded in a medium-size merchandising firm.
[Figure: Sales (vertical axis, 0 to 70) against Day (1 to 20), Week 1 to Week 4]
Moving Average
3-day moving average
Day   Sales   3-day moving sum   3-day moving average
1     43
2     45      110.0              36.7
3     22       92.0              30.7
4     25       78.0              26.0
5     31      107.0              35.7
6     51      etc.               etc.
e.g. for day 2: $43 + 45 + 22 = 110$ and $110 / 3 = 36.67$
[Figure: Sales against Day (1 to 20) with the MA(3) and MA(5) series]
The longer the period, the stronger the smoothing.
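The 3-day moving average above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name `moving_average` is made up for illustration, and the data are the first six sales figures listed in Example 1.

```python
def moving_average(values, k):
    """Return the k-period moving averages of a list of numbers."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


# First six daily sales figures from Example 1
sales = [43, 45, 22, 25, 31, 51]

# One value for each of days 2, 3, 4 and 5 (the middle day of each window)
ma3 = moving_average(sales, 3)
print([round(m, 1) for m in ma3])
```

A larger k averages over more days, so each unusual day has less influence and the resulting series is smoother.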
Centered Moving Average
A 4-period moving average, MA(4), falls between the second and third of the four observations it averages.
Since this makes interpretation and graphing difficult, we use centered moving averages, which are the 2-period moving averages of the 4-period moving averages.
4-day centered moving average CMA(4). Each MA(4) value below belongs between the day on its row and the following day.
Day   Sales   MA(4)   CMA(4)
1     43
2     45      33.8
3     22      30.8    32.3
4     25      32.3    31.5
5     31      37.0    34.6
6     51      40.0    38.5
7     41      etc.    etc.
e.g. for day 3: $(33.8 + 30.8) / 2 = 32.3$
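The centering step can be sketched in Python as follows. The helper names are hypothetical, and the data are the seven sales figures shown in the table; the code prints unrounded values, whereas the slide rounds them to one decimal.

```python
def moving_average(values, k):
    """Return the k-period moving averages of a list of numbers."""
    return [sum(values[i:i + k]) / k for i in range(len(values) - k + 1)]


def centered_moving_average(values, k):
    """For an even k, average each pair of adjacent k-period moving averages."""
    ma = moving_average(values, k)
    return [(ma[i] + ma[i + 1]) / 2 for i in range(len(ma) - 1)]


# Daily sales for days 1 to 7
sales = [43, 45, 22, 25, 31, 51, 41]

ma4 = moving_average(sales, 4)            # values fall between days 2-3, 3-4, 4-5, 5-6
cma4 = centered_moving_average(sales, 4)  # values line up with days 3, 4 and 5
print(ma4)
print(cma4)
```

With only seven observations the sketch yields three centered values; the fourth one in the table (38.5) needs the sales figure for day 8, which is not listed.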
Exponential Smoothing
$E_i = w Y_i + (1 - w) E_{i-1}$
where
E_i : exponentially smoothed value for time period i;
E_{i-1} : exponentially smoothed value for time period i - 1;
Y_i : observed value for time period i;
w : smoothing constant, 0 < w < 1.
a) Assuming that Y has been observed from i = 1, this formula can be applied only from the second time period.
For i = 1 we set the smoothed value equal to the observed value, i.e. E_1 = Y_1.
b) The smoothing constant determines the strength of smoothing: the larger the value of w, the weaker the smoothing effect.
Example 2: Using the quarterly Australian unemployed persons (in thousands) data for the years 1989-98,
a) Apply the exponential smoothing technique with w = 0.2 and w = 0.7.
Year   Quarter   Y        E (w=0.2)   E (w=0.7)
1989   1         1735.6   1735.6      1735.6
       2         1507.9   1690.1      1576.2
       3         1450.2   1642.1      1488.0
       4         1402.7   1594.2      1428.3
1990   1         1689.9   1613.3      1611.4
       2         1621.4   1615.0      1618.4
etc.   etc.      etc.     etc.        etc.
For w = 0.2:
$E_1 = Y_1 = 1735.6$
$E_2 = w Y_2 + (1 - w) E_1 = 0.2 \times 1507.9 + 0.8 \times 1735.6 = 1690.1$
$E_3 = w Y_3 + (1 - w) E_2 = 0.2 \times 1450.2 + 0.8 \times 1690.1 = 1642.1$
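The recursion in Example 2 is easy to check in code. The sketch below assumes a hypothetical helper called `exponential_smoothing` and uses only the six observations listed in the table; it carries full precision from one period to the next, so the results agree with the table after rounding to one decimal.

```python
def exponential_smoothing(y, w):
    """E_1 = Y_1, then E_i = w * Y_i + (1 - w) * E_{i-1} for i >= 2."""
    smoothed = [y[0]]
    for obs in y[1:]:
        smoothed.append(w * obs + (1 - w) * smoothed[-1])
    return smoothed


# Unemployed persons ('000), 1989 Q1 to 1990 Q2, as listed in Example 2
unemployed = [1735.6, 1507.9, 1450.2, 1402.7, 1689.9, 1621.4]

e02 = exponential_smoothing(unemployed, 0.2)  # strong smoothing
e07 = exponential_smoothing(unemployed, 0.7)  # weak smoothing
for y, a, b in zip(unemployed, e02, e07):
    print(f"{y:8.1f} {a:8.1f} {b:8.1f}")
```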
Plot the time series and the exponentially smoothed values on the same graph.
[Figure: Y and the smoothed series for w = 0.2 and w = 0.7, quarterly, 1989 to 1998 (vertical axis 0 to 3500)]
If w = 0.7, Ei is quite similar to Yi, i.e. there is very little smoothing.
However, if w = 0.2, Ei does not have the fluctuations of Yi, i.e. there is far more smoothing.
HOW TO CAPTURE THE TREND, CYCLICAL AND SEASONAL COMPONENTS?
If we decompose a time series into the trend, seasonal and cyclical components, then we can construct a forecast by projecting these parts into the future.
1) Trend analysis
The easiest way of isolating a long-term linear trend is by simple linear regression, where the independent variable is the time variable i, which is equal to 1 for the first time period in the sample and increases by one each period thereafter.
Example 3: The graph below shows Australian exports of footwear ($m) from 1988 through 2000.
[Figure: Exports: 85: Footwear, annual, $m, 1988 to 2000]
This time series has an upward trend, which is linear.
Estimate a linear trend line using Excel.
First you have to create a time variable Xi and then regress fwexport on Xi.
year   i      fwexport
1988   1      14
1989   2      23
1990   3      22
1991   4      30
1992   5      36
1993   etc.   etc.
$\hat{Y}_i = 15.308 + 4.505 X_i$
[Figure: fwexport and y-hat, 1988 to 2000 (vertical axis 0 to 80)]
$\hat{Y}_i = 15.308 + 4.505 X_i$
Note: In the first year of the sample period i = 1, not 0.
$b_0 = 15.308$: In 1987 (i = 0) the trend value of footwear exports is 15.308 $m.
$b_1 = 4.505$: Each year the value of footwear exports increases by 4.505 $m.
In 1988 (i = 1): $\hat{Y}_1 = 15.308 + 4.505 \times 1 = 19.813$ ($m)
In 1999 (i = 12): $\hat{Y}_{12} = 15.308 + 4.505 \times 12 = 69.368$ ($m)
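The lecture estimates the trend line in Excel; the same least-squares calculation can be sketched in Python. Two things here are assumptions for illustration only: the function names are made up, and the fit uses just the five observations listed in the transcript, so its coefficients differ from the full-sample values 15.308 and 4.505. The trend values for 1988 and 1999 are computed from the coefficients reported in the lecture.

```python
def fit_linear_trend(y):
    """Least-squares fit of y on the time variable i = 1, 2, ..., n."""
    n = len(y)
    x = list(range(1, n + 1))
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def trend_value(b0, b1, i):
    """Trend estimate for time period i."""
    return b0 + b1 * i


# Illustrative fit on the five listed observations only (1988 to 1992)
b0_partial, b1_partial = fit_linear_trend([14, 23, 22, 30, 36])
print(b0_partial, b1_partial)

# Trend values from the coefficients reported in the lecture
y_hat_1988 = trend_value(15.308, 4.505, 1)
y_hat_1999 = trend_value(15.308, 4.505, 12)
print(y_hat_1988, y_hat_1999)
```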
2) Measuring the cyclical effect
Assume that the time series model is multiplicative and consists of only two parts: the trend and the cyclical components so that
$Y_i = T_i \times C_i$ and hence $C_i = Y_i / T_i$
Under these assumptions the cyclical effect can be measured by expressing the actual data as the percentage of the trend:
$(Y_i / \hat{Y}_i) \times 100\%$
Calculate and plot the percentage of trend.
year   t      fwexport   Y-hat   Y/Y-hat*100
1988   1      14         19.81   70.66
1989   2      23         24.32   94.58
1990   3      22         28.82   76.32
1991   4      30         33.33   90.01
1992   5      etc.       etc.    etc.
e.g. for 1988: $14 / 19.81 \times 100 \approx 71$
So in 1988 the actual exports of footwear were about 29% below the trend line.
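The percentage-of-trend column can be reproduced with the short sketch below. It assumes the trend coefficients from Example 3 and a hypothetical helper `percentage_of_trend`; because the coefficients are rounded, the last decimal may differ slightly from the table.

```python
def percentage_of_trend(actual, trend):
    """Express each observation as a percentage of its trend value."""
    return [100 * y / t for y, t in zip(actual, trend)]


# Footwear exports ($m) for 1988 to 1991 and the matching trend values
fwexport = [14, 23, 22, 30]
trend = [15.308 + 4.505 * i for i in range(1, 5)]

pct = percentage_of_trend(fwexport, trend)
for year, p in zip(range(1988, 1992), pct):
    print(year, round(p, 2))
```

Values below 100 mean the series is under its trend line in that year, and values above 100 mean it is over it.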
[Figure: percentage of trend, $(Y_i / \hat{Y}_i) \times 100\%$, 1988 to 2000 (vertical axis 60 to 130), with boom, recession, expansion phase and contraction phase marked]
We have assumed that the time series pattern does not have a seasonal component and that the random variations are negligible.
In Example 3 the first of these assumptions is certainly satisfied since the data is annual.
3) Measuring the seasonal effect
Depending on the nature of the time series, the seasonal variations can be captured in different ways.
i. Assume, for example, that the time series does not contain a discernible cyclical component and can be described by the following multiplicative model
$Y_i = T_i \times S_i \times I_i$, so that $Y_i / T_i = S_i \times I_i$
This suggests that dividing the time series by the estimated trend component (y-hat) gives an estimate for the product of the seasonal and random variations.
Seasonal factor: $Y_i / \hat{Y}_i$
In order to remove the random variations from this ratio, we average the seasonal factors for each season and adjust these averages to ensure that they add up to the number of seasons.
ii. When the time series model is multiplicative and has all four parts,
$Y_i = T_i \times C_i \times S_i \times I_i$
the data is first divided by (centered) moving averages, which are supposed to capture the trend and cyclical components,
$CMA_i = T_i \times C_i$
Then the seasonal factors and indices are calculated from this ratio:
$Y_i / CMA_i = (T_i \times C_i \times S_i \times I_i) / (T_i \times C_i) = S_i \times I_i$
and the trend and cyclical components are estimated from the centered moving averages, instead of the original data.
Note: The order of the centered moving average must be equal to the number of seasons. For example, we use 4-quarter CMA if the data is quarterly and seasonality has 4 phases a year, and we use 12-month CMA if the data is monthly and seasonality has 12 phases a year.
Example 4: The graph below shows retail turnover for household goods ($m) for Australia from the second quarter of 1982 through the fourth quarter of 2000.
[Figure: Retail turnover: Original: Household good retailing, quarterly, $m, Dec-82 to Dec-00]
This time series has an upward linear trend and quarterly seasonal variations.
It probably has some cyclical variations too, but this third component seems to be less significant than the other two.
a) Estimate a linear trend line with Excel.
$\hat{Y}_i = 1589.189 + 36.604 X_i$
b) Calculate the seasonal factors and the seasonal indices.
$\hat{Y}_i = 1589.189 + 36.604 X_i$
quarter   t   retail   Y-hat    Y/Y-hat
Jun-82    1   1553.2   1625.8   0.955
Sep-82    2   1601.9   1662.4   0.964
Dec-82    3   2052.2   1699.0   1.208
Mar-83    4   1666.0   1735.6   0.960
Jun-83    5   1680.4   1772.2   0.948
e.g. for Jun-82: $1553.2 / 1625.8 = 0.955$
In order to find the seasonal indices, the seasonal factors (Y/Y-hat) have to be grouped, averaged and, if necessary, adjusted.
Year      Q1       Q2       Q3       Q4
1982               0.955    0.964    1.208
1983      0.960    0.948    0.962    1.240
1984      0.948    0.890    0.905    1.163
etc.      etc.     etc.     etc.     etc.
1998      0.914    0.908    0.909    1.031
1999      0.909    0.922    0.971    1.129
2000      0.973    1.043    0.990    1.144
Sum       16.728   18.062   18.283   21.945   Total
Average   0.929    0.951    0.962    1.155    3.997
Index     0.930    0.951    0.963    1.156    4.000
Each index is the average multiplied by 4.000/3.997, e.g. for Q1: $0.929 \times 4.000 / 3.997 = 0.930$
$S_{Mar} = 93.0\%$, $S_{Jun} = 95.1\%$, $S_{Sep} = 96.3\%$, $S_{Dec} = 115.6\%$
These seasonal indices suggest that in the March, June and September quarters retail turnover is expected to be 7.0, 4.9 and 3.7% below its trend value, while in the December quarter retail turnover is expected to be 15.6% above its trend value.
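The averaging and adjustment steps can be sketched as follows. The column sums come from the table; the observation counts are an assumption inferred from the sample period (June quarter 1982 to December quarter 2000, so 18 March quarters and 19 of each other quarter), and the function name is hypothetical.

```python
def seasonal_indices(averages):
    """Rescale seasonal averages so that they add up to the number of seasons."""
    scale = len(averages) / sum(averages)
    return [a * scale for a in averages]


# Column sums of the seasonal factors for Q1 (Mar) to Q4 (Dec), from the table
sums = [16.728, 18.062, 18.283, 21.945]
# Assumed number of observations per quarter (inferred from the sample period)
counts = [18, 19, 19, 19]

averages = [s / n for s, n in zip(sums, counts)]
indices = seasonal_indices(averages)
print([round(a, 3) for a in averages])
print([round(v, 3) for v in indices])
```

The adjustment is small here because the raw averages already add up to almost 4.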
c) De-seasonalising a time series.
quarter   t   retail
Jun-82    1   1553.2
Sep-82    2   1601.9
Dec-82    3   2052.2
Mar-83    4   1666.0
Jun-83    5   1680.4
Sep-83    6   etc.
Seasonal indices:
$S_{Mar} = 93.0\%$, $S_{Jun} = 94.8\%$, $S_{Sep} = 96.5\%$, $S_{Dec} = 115.7\%$
Seasonal indices can be used to deseasonalise a time series, i.e. to remove the seasonal variations from the data.
The seasonally adjusted data (in publications usually denoted as sa) is obtained by dividing the observed, unadjusted data by the seasonal indices.
e.g. For the June quarter of 1982 the seasonally adjusted retail turnover is
$1553.2 / 94.8 \times 100 \approx 1638.2$ ($m)
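Deseasonalising is a single division per observation, as the sketch below shows. It uses the seasonal indices listed on this slide written as proportions; the function name and the dictionary layout are illustrative choices. With the rounded June index the first adjusted value comes out at roughly 1638, close to the figure on the slide, which presumably uses an unrounded index.

```python
def deseasonalise(values, seasons, index):
    """Divide each observation by the seasonal index of its quarter."""
    return [v / index[s] for v, s in zip(values, seasons)]


# Seasonal indices from this slide, as proportions rather than percentages
index = {"Mar": 0.930, "Jun": 0.948, "Sep": 0.965, "Dec": 1.157}

# Retail turnover ($m), Jun-82 to Jun-83
retail = [1553.2, 1601.9, 2052.2, 1666.0, 1680.4]
seasons = ["Jun", "Sep", "Dec", "Mar", "Jun"]

adjusted = deseasonalise(retail, seasons, index)
for s, v in zip(seasons, adjusted):
    print(s, round(v, 1))
```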
INTRODUCTION TO FORECASTING
• After having studied the historical pattern of a time series, if there is reason to believe that the most important features of the variable do not change in the future, we can project the revealed pattern into the future in order to develop forecasts.
• If a time series exhibits no (or hardly any) trend, cyclical and seasonal variations, exponential smoothing can provide a useful forecast for one period ahead:
$F_{i+1} = E_i$
Example 5: (Refer to Example 2) We have applied exponential smoothing with w = 0.2 and w = 0.7 to quarterly Australian unemployed persons (in thousands).
Since this time series does have some seasonal variations, exponential smoothing cannot be expected to forecast unemployment reasonably well.
Nevertheless, just for illustration, let us forecast unemployment for the first quarter of 1999.
Year   Quarter   unemployed   E (w=0.7)
1998   1         2461.4       2402.8
       2         2210.9       2268.5
       3         2221.3       2235.5
       4         2102.6       2142.5
The last value, 2142.5, is the smoothed value for the fourth quarter of 1998, and thus the forecast for the first quarter of 1999.
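The one-period-ahead forecast can be checked by continuing the smoothing recursion through 1998. In this sketch the starting value 2402.8 is the smoothed value for the first quarter of 1998 taken from the table, and `one_step_forecast` is a hypothetical helper name.

```python
def one_step_forecast(start, observations, w):
    """Update E_i = w * Y_i + (1 - w) * E_{i-1}; the last E_i is the next forecast."""
    e = start
    for obs in observations:
        e = w * obs + (1 - w) * e
    return e


# Smoothed value for 1998 Q1 (w = 0.7) and the observations for 1998 Q2 to Q4
e_1998_q1 = 2402.8
unemployed = [2210.9, 2221.3, 2102.6]

forecast_1999_q1 = one_step_forecast(e_1998_q1, unemployed, 0.7)
print(round(forecast_1999_q1, 1))
```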
• If a time series exhibits a long-term (linear) trend and seasonal variations, we can use regression analysis to develop forecasts in two different ways.
We can forecast using the estimated trend and seasonal indices as:
$F_i = T_i \times S_i = (b_0 + b_1 X_i) \times S_i$
Example 6: (Refer to Example 4) Forecasting retail turnover for household goods for the first quarter of 2001 with the first approach can be implemented as follows.
Obtain the trend estimate from part a and the March seasonal index from part b so that
$\hat{Y}_i = 1589.189 + 36.604 X_i$, i = 76, $S_{76} = S_{Mar} = 0.930$ and
$F_{76} = \hat{Y}_{76} \times S_{76} = (1589.189 + 36.604 \times 76) \times 0.930 \approx 4064.8$
We have predicted retail turnover for household goods for the first quarter of 2001. Suppose we had another forecast value of 4203.4 for the same data and the same time period using a different forecasting model. How would we decide which forecast is more accurate?
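The forecast in Example 6 is the trend value multiplied by the seasonal index. A minimal sketch follows, using the rounded coefficients and index quoted above and a hypothetical function name; with these rounded inputs the result is about 4065, marginally different from the 4064.8 on the slide, which presumably comes from unrounded Excel values.

```python
def seasonal_trend_forecast(b0, b1, i, seasonal_index):
    """Forecast F_i = (b0 + b1 * i) * S_i for a multiplicative trend-seasonal model."""
    return (b0 + b1 * i) * seasonal_index


# March quarter 2001 is period i = 76 when the June quarter 1982 is i = 1
f76 = seasonal_trend_forecast(1589.189, 36.604, 76, 0.930)
print(round(f76, 1))
```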
Measuring the forecast error
• How can we decide which forecasting model is the most accurate in a given situation?
Forecast the variable of interest for a number of time periods using alternative models and evaluate some measure(s) of forecast accuracy for each of these models. Among the possible criteria that can be used for this purpose, a commonly used one is the
Mean absolute deviation: $MAD = \frac{1}{n} \sum_{t=1}^{n} |y_t - F_t|$
Example 7: Two forecasting models were used to predict the future values of a time series. They are shown next, together with the actual values. For each model, calculate MAD to determine which was more accurate.
         F_t                 e_t                 | e_t |
y_t      Model 1   Model 2   Model 1   Model 2   Model 1   Model 2
6.0      7.5       6.3       -1.5      -0.3      1.5       0.3
6.6      6.3       6.7        0.3      -0.1      0.3       0.1
7.3      5.4       7.1        1.9       0.2      1.9       0.2
9.4      8.2       7.5        1.2       1.9      1.2       1.9
Total:                                           4.9       2.5
e.g. for Model 1 in the first period: e_1 = 6.0 - 7.5 = -1.5 and | -1.5 | = 1.5
Model 1: MAD = 4.9/4 = 1.225
Model 2: MAD = 2.5/4 = 0.625
Model 2 is the more accurate.
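The MAD comparison in Example 7 can be verified with a short Python sketch. The function name is illustrative; the data are the actual values and the two sets of forecasts from the table.

```python
def mean_absolute_deviation(actual, forecast):
    """MAD = (1/n) * sum of |y_t - F_t| over the n forecast periods."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


actual = [6.0, 6.6, 7.3, 9.4]
model_1 = [7.5, 6.3, 5.4, 8.2]
model_2 = [6.3, 6.7, 7.1, 7.5]

mad_1 = mean_absolute_deviation(actual, model_1)
mad_2 = mean_absolute_deviation(actual, model_2)
print(round(mad_1, 3), round(mad_2, 3))
print("Model 2 is more accurate" if mad_2 < mad_1 else "Model 1 is more accurate")
```

The model with the smaller MAD has forecasts that are, on average, closer to the actual values.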