Rob J Hyndman
Forecasting hierarchical time series
Outline
1 Hierarchical time series
2 Forecasting framework
3 Optimal forecasts
4 Approximately optimal forecasts
5 Application to Australian tourism
6 hts package for R
7 References
Forecasting hierarchical time series Hierarchical time series 2
Introduction
Total
  A: AA, AB, AC
  B: BA, BB, BC
  C: CA, CB, CC
Examples
Manufacturing product hierarchies
Net labour turnover
Pharmaceutical sales
Tourism demand by region and purpose
Forecasting the PBS
ATC drug classification
A Alimentary tract and metabolism
B Blood and blood forming organs
C Cardiovascular system
D Dermatologicals
G Genito-urinary system and sex hormones
H Systemic hormonal preparations, excluding sex hormones and insulins
J Anti-infectives for systemic use
L Antineoplastic and immunomodulating agents
M Musculo-skeletal system
N Nervous system
P Antiparasitic products, insecticides and repellents
R Respiratory system
S Sensory organs
V Various
ATC drug classification
A Alimentary tract and metabolism (one of 14 classes)
A10 Drugs used in diabetes (one of 84 classes)
A10B Blood glucose lowering drugs
A10BA Biguanides
A10BA02 Metformin
Australian tourism
Also split by purpose of travel:
Holiday
Visits to friends and relatives
Business
Other
Hierarchical/grouped time series
A hierarchical time series is a collection of several time series that are linked together in a hierarchical structure.
Example: Pharmaceutical products are organized in a hierarchy under the Anatomical Therapeutic Chemical (ATC) Classification System.
A grouped time series is a collection of time series that are aggregated in a number of non-hierarchical ways.
Example: Australian tourism demand is grouped by region and purpose of travel.
Hierarchical/grouped time series
Forecasts should be “aggregate consistent”, unbiased, minimum variance.
Existing methods:
- Bottom-up
- Top-down
- Middle-out
How to compute forecast intervals?
Most research is concerned with the relative performance of existing methods.
There is no research on how to deal with forecasting grouped time series.
Top-down method
Advantages
Works well in presence of low counts.
Single forecasting model easy to build.
Provides reliable forecasts for aggregate levels.
Disadvantages
Loss of information, especially individual series dynamics.
Distribution of forecasts to lower levels can be difficult.
No prediction intervals.
Bottom-up method
Advantages
No loss of information.
Better captures dynamics of individual series.
Disadvantages
Large number of series to be forecast.
Constructing forecasting models is harder because of noisy data at bottom level.
No prediction intervals.
A new approach
We propose a new statistical framework for forecasting hierarchical time series which:
1 provides point forecasts that are consistent across the hierarchy;
2 allows for correlations and interaction between series at each level;
3 provides estimates of forecast uncertainty which are consistent across the hierarchy;
4 allows for ad hoc adjustments and inclusion of covariates at any level.
Hierarchical data
Total
  A  B  C
$Y_t$: observed aggregate of all series at time $t$.
$Y_{X,t}$: observation on series $X$ at time $t$.
$\mathbf{B}_t$: vector of all series at bottom level in time $t$.
\[
\mathbf{Y}_t = [Y_t, Y_{A,t}, Y_{B,t}, Y_{C,t}]' =
\underbrace{\begin{bmatrix}
1 & 1 & 1 \\
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}}_{\mathbf{S}}
\underbrace{\begin{bmatrix}
Y_{A,t} \\ Y_{B,t} \\ Y_{C,t}
\end{bmatrix}}_{\mathbf{B}_t}
\]
\[
\mathbf{Y}_t = \mathbf{S}\mathbf{B}_t
\]
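As a minimal sketch of the identity Y_t = S B_t for the hierarchy Total -> {A, B, C}, the summing matrix can be written out and applied to a bottom-level vector. The numbers in B_t are invented for illustration, and matvec is a helper name introduced here, not something from the talk.

```python
# Summing matrix S for the hierarchy Total -> {A, B, C}.
# Rows follow the stacking order of Y_t: Total, A, B, C.
S = [
    [1, 1, 1],  # Total = A + B + C
    [1, 0, 0],  # A
    [0, 1, 0],  # B
    [0, 0, 1],  # C
]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


# Hypothetical bottom-level observations (Y_A, Y_B, Y_C) at some time t.
B_t = [10, 20, 30]

Y_t = matvec(S, B_t)
print(Y_t)  # [60, 10, 20, 30]
```

The first entry of Y_t is the aggregate and the remaining entries reproduce the bottom-level values, which is exactly what the rows of S encode.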
Hierarchical data
Total
  A: AX, AY, AZ
  B: BX, BY, BZ
  C: CX, CY, CZ
\[
\mathbf{Y}_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{C,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{bmatrix}}_{\mathbf{S}}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{AZ,t} \\ Y_{BX,t} \\ Y_{BY,t} \\ Y_{BZ,t} \\ Y_{CX,t} \\ Y_{CY,t} \\ Y_{CZ,t}
\end{bmatrix}}_{\mathbf{B}_t}
\]
\[
\mathbf{Y}_t = \mathbf{S}\mathbf{B}_t
\]
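For larger hierarchies it is tedious to type S by hand. The sketch below shows one possible way to derive it from a parent-to-children description of the two-level hierarchy. The dictionary tree and the helpers leaves_under and level_order are names assumed here for illustration only.

```python
# Hypothetical description of the two-level hierarchy: parent -> children.
tree = {
    "Total": ["A", "B", "C"],
    "A": ["AX", "AY", "AZ"],
    "B": ["BX", "BY", "BZ"],
    "C": ["CX", "CY", "CZ"],
}


def leaves_under(node):
    """Return the bottom-level series below `node`, in left-to-right order."""
    if node not in tree:
        return [node]
    result = []
    for child in tree[node]:
        result.extend(leaves_under(child))
    return result


def level_order(root):
    """List all nodes level by level, matching the stacking order of Y_t."""
    order, current = [], [root]
    while current:
        order.extend(current)
        current = [child for node in current for child in tree.get(node, [])]
    return order


bottom = leaves_under("Total")
rows = level_order("Total")

# One row per series; an entry is 1 when that bottom-level series contributes.
S = [[1 if leaf in leaves_under(node) else 0 for leaf in bottom] for node in rows]

for name, row in zip(rows, S):
    print(f"{name:>5}", row)
```

The resulting matrix has 13 rows (one per series) and 9 columns (one per bottom-level series).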
Grouped data
Total
  A: AX, AY
  B: BX, BY
Total
  X: AX, BX
  Y: AY, BY
\[
\mathbf{Y}_t =
\begin{bmatrix}
Y_t \\ Y_{A,t} \\ Y_{B,t} \\ Y_{X,t} \\ Y_{Y,t} \\ Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}
=
\underbrace{\begin{bmatrix}
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 \\
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}}_{\mathbf{S}}
\underbrace{\begin{bmatrix}
Y_{AX,t} \\ Y_{AY,t} \\ Y_{BX,t} \\ Y_{BY,t}
\end{bmatrix}}_{\mathbf{B}_t}
\]
\[
\mathbf{Y}_t = \mathbf{S}\mathbf{B}_t
\]
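The grouped case differs only in that the same bottom-level series are aggregated along two crossed factors. A rough sketch, assuming two factors with levels A/B and X/Y as in the diagram above (the variable names are illustrative):

```python
from itertools import product

# Hypothetical grouping factors: the first has levels A/B, the second X/Y.
factor1 = ["A", "B"]
factor2 = ["X", "Y"]

# Bottom-level series are all combinations, in the order AX, AY, BX, BY.
bottom = list(product(factor1, factor2))

labels = ["Total"]
S = [[1] * len(bottom)]  # Total: every bottom-level series contributes

# Aggregates over the first factor (A, B).
for level in factor1:
    labels.append(level)
    S.append([1 if f1 == level else 0 for f1, _ in bottom])

# Aggregates over the second factor (X, Y).
for level in factor2:
    labels.append(level)
    S.append([1 if f2 == level else 0 for _, f2 in bottom])

# The bottom-level series themselves (identity block).
for series in bottom:
    labels.append("".join(series))
    S.append([1 if other == series else 0 for other in bottom])

for name, row in zip(labels, S):
    print(f"{name:>5}", row)
```

Because both groupings share one set of bottom-level series, a single S covers them and the relation Y_t = S B_t is unchanged.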
2 Forecasting framework
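Forecasting notation
Let $\hat{\mathbf{Y}}_n(h)$ be the vector of initial $h$-step forecasts, made at time $n$, stacked in the same order as $\mathbf{Y}_t$. (They may not add up.)
Hierarchical forecasting methods are of the form
\[
\tilde{\mathbf{Y}}_n(h) = \mathbf{S}\mathbf{P}\hat{\mathbf{Y}}_n(h)
\]
for some matrix $\mathbf{P}$.
$\mathbf{P}$ extracts and combines base forecasts $\hat{\mathbf{Y}}_n(h)$ to get bottom-level forecasts.
$\mathbf{S}$ adds them up.
Revised reconciled forecasts: $\tilde{\mathbf{Y}}_n(h)$.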
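Bottom-up forecasts
\[
\tilde{\mathbf{Y}}_n(h) = \mathbf{S}\mathbf{P}\hat{\mathbf{Y}}_n(h)
\]
Bottom-up forecasts are obtained using
\[
\mathbf{P} = [\mathbf{0} \mid \mathbf{I}],
\]
where $\mathbf{0}$ is a null matrix and $\mathbf{I}$ is an identity matrix.
The $\mathbf{P}$ matrix extracts only the bottom-level forecasts from $\hat{\mathbf{Y}}_n(h)$.
$\mathbf{S}$ adds them up to give the bottom-up forecasts.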
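A minimal sketch of the bottom-up choice P = [0 | I], using the hierarchy Total -> {A, B, C} from earlier in the talk. The base forecasts in y_hat are invented numbers chosen so that they do not add up, and matmul is a helper defined here rather than a function from any package.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Summing matrix for Total -> {A, B, C}; rows are Total, A, B, C.
S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
n_total, n_bottom = len(S), len(S[0])
n_upper = n_total - n_bottom

# P = [0 | I]: zero columns for the upper levels, identity for the bottom level.
P = [
    [0] * n_upper + [1 if i == j else 0 for j in range(n_bottom)]
    for i in range(n_bottom)
]

# Hypothetical base forecasts (Total, A, B, C) as a column vector.
# They are incoherent: 10 + 20 + 30 = 60, but the Total forecast is 65.
y_hat = [[65], [10], [20], [30]]

# Revised forecasts: S P y_hat.
y_tilde = matmul(S, matmul(P, y_hat))
print([row[0] for row in y_tilde])  # [60, 10, 20, 30]
```

The base forecast for Total (65) is discarded entirely, which illustrates that bottom-up uses no information from the upper levels.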
Top-down forecasts
\[
\tilde{\mathbf{Y}}_n(h) = \mathbf{S}\mathbf{P}\hat{\mathbf{Y}}_n(h)
\]
Top-down forecasts are obtained using
\[
\mathbf{P} = [\mathbf{p} \mid \mathbf{0}]
\]
where $\mathbf{p} = [p_1, p_2, \dots, p_{m_K}]'$ is a vector of proportions that sum to one.
$\mathbf{P}$ distributes forecasts of the aggregate to the lowest level series.
Different methods of top-down forecasting lead to different proportionality vectors $\mathbf{p}$.
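A comparable sketch for the top-down choice P = [p | 0]. The proportions p and the base forecasts y_hat are invented for illustration, and the talk does not say how p should be chosen. Fraction is used only to keep the arithmetic exact.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Summing matrix for Total -> {A, B, C}; rows are Total, A, B, C.
S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]
n_total = len(S)

# Hypothetical proportions for A, B, C; they must sum to one.
p = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]
assert sum(p) == 1

# P = [p | 0]: only the first column (the Total forecast) is used.
P = [[share] + [0] * (n_total - 1) for share in p]

# Hypothetical base forecasts (Total, A, B, C) as a column vector.
y_hat = [[100], [25], [28], [40]]

# Revised forecasts: S P y_hat.
y_tilde = matmul(S, matmul(P, y_hat))
print([float(row[0]) for row in y_tilde])  # [100.0, 20.0, 30.0, 50.0]
```

Here the base forecasts for A, B and C are ignored and the Total forecast is simply split according to p.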
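General properties: bias
\[
\tilde{\mathbf{Y}}_n(h) = \mathbf{S}\mathbf{P}\hat{\mathbf{Y}}_n(h)
\]
Assume: base forecasts $\hat{\mathbf{Y}}_n(h)$ are unbiased:
\[
E[\hat{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n] = E[\mathbf{Y}_{n+h} \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n]
\]
Let $\hat{\mathbf{B}}_n(h)$ be bottom level base forecasts with $\boldsymbol{\beta}_n(h) = E[\hat{\mathbf{B}}_n(h) \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n]$.
Then $E[\hat{\mathbf{Y}}_n(h)] = \mathbf{S}\boldsymbol{\beta}_n(h)$.
We want the revised forecasts to be unbiased: $E[\tilde{\mathbf{Y}}_n(h)] = \mathbf{S}\mathbf{P}\mathbf{S}\boldsymbol{\beta}_n(h) = \mathbf{S}\boldsymbol{\beta}_n(h)$.
Result will hold provided $\mathbf{S}\mathbf{P}\mathbf{S} = \mathbf{S}$.
True for bottom-up, but not for any top-down method or middle-out method.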
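The condition SPS = S is easy to check numerically. The sketch below does so for the hierarchy Total -> {A, B, C}, with one bottom-up P and one top-down P built from hypothetical proportions. It checks a single example and is not a proof of the general claim.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


# Summing matrix for Total -> {A, B, C}; rows are Total, A, B, C.
S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Bottom-up: P = [0 | I].
P_bottom_up = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Top-down: P = [p | 0], with hypothetical proportions p that sum to one.
p = [Fraction(1, 5), Fraction(3, 10), Fraction(1, 2)]
P_top_down = [[share, 0, 0, 0] for share in p]

# Does S P S equal S?
bottom_up_ok = matmul(S, matmul(P_bottom_up, S)) == S
top_down_ok = matmul(S, matmul(P_top_down, S)) == S

print(bottom_up_ok)  # True
print(top_down_ok)   # False
```

For the top-down P, the product PS has identical columns (each equal to p), so SPS cannot reproduce the identity block in the lower part of S.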
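General properties: variance
\[
\tilde{\mathbf{Y}}_n(h) = \mathbf{S}\mathbf{P}\hat{\mathbf{Y}}_n(h)
\]
Let variance of base forecasts $\hat{\mathbf{Y}}_n(h)$ be given by
\[
\boldsymbol{\Sigma}_h = V[\hat{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n]
\]
Then the variance of the revised forecasts is given by
\[
V[\tilde{\mathbf{Y}}_n(h) \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n] = \mathbf{S}\mathbf{P}\boldsymbol{\Sigma}_h\mathbf{P}'\mathbf{S}'.
\]
This is a general result for all existing methods.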
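To make the variance formula concrete, the sketch below evaluates S P Sigma_h P' S' for the bottom-up P on the hierarchy Total -> {A, B, C}. The diagonal Sigma used here is a made-up example. In practice the base forecast errors would usually be correlated.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


# Summing matrix for Total -> {A, B, C}; rows are Total, A, B, C.
S = [[1, 1, 1], [1, 0, 0], [0, 1, 0], [0, 0, 1]]

# Bottom-up: P = [0 | I].
P = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Hypothetical variance of the base forecasts (Total, A, B, C), uncorrelated.
Sigma = [
    [4, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 2, 0],
    [0, 0, 0, 3],
]

# Variance of the revised forecasts: S P Sigma P' S'.
SP = matmul(S, P)
V = matmul(matmul(SP, Sigma), transpose(SP))

for row in V:
    print(row)
# [6, 1, 2, 3]
# [1, 1, 0, 0]
# [2, 0, 2, 0]
# [3, 0, 0, 3]
```

The revised Total has variance 6 = 1 + 2 + 3, the sum of the bottom-level variances, and the base variance of 4 for Total plays no role under bottom-up.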
3 Optimal forecasts
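Forecasts
Key idea: forecast reconciliation
- Ignore structural constraints and forecast every series of interest independently.
- Adjust forecasts to impose constraints.
Let $\hat{\mathbf{Y}}_n(h)$ be the vector of initial $h$-step forecasts, made at time $n$, stacked in the same order as $\mathbf{Y}_t$.
$\mathbf{Y}_t = \mathbf{S}\mathbf{B}_t$. So $\hat{\mathbf{Y}}_n(h) = \mathbf{S}\boldsymbol{\beta}_n(h) + \boldsymbol{\varepsilon}_h$.
$\boldsymbol{\beta}_n(h) = E[\mathbf{B}_{n+h} \mid \mathbf{Y}_1, \dots, \mathbf{Y}_n]$.
$\boldsymbol{\varepsilon}_h$ has zero mean and covariance $\boldsymbol{\Sigma}_h$.
Estimate $\boldsymbol{\beta}_n(h)$ using GLS?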
Optimal combination forecasts
$$\tilde{Y}_n(h) = S\hat{\beta}_n(h) = S(S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}\hat{Y}_n(h)$$
Here $\tilde{Y}_n(h)$ are the revised forecasts and $\hat{Y}_n(h)$ are the initial forecasts.
$\Sigma_h^{\dagger}$ is a generalized inverse of $\Sigma_h$.
Optimal $P = (S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}$
Revised forecasts unbiased: $SPS = S$.
Revised forecasts minimum variance:
$$V[\tilde{Y}_n(h) \mid Y_1, \dots, Y_n] = SP\Sigma_hP'S' = S(S'\Sigma_h^{\dagger}S)^{-1}S'$$
Problem: $\Sigma_h$ hard to estimate.
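The properties on this slide can be checked numerically. The sketch below assumes a small hierarchy (Total, A, B, C) and a made-up nonsingular covariance matrix, so the ordinary inverse stands in for the generalized inverse. It computes P, the revised forecasts, and the variance expression.

```r
S <- rbind(c(1, 1, 1), diag(3))        # rows: Total, A, B, C
Sigma_h <- diag(c(4, 1, 2, 3))         # assumed error covariance (invented)
Sigma_inv <- solve(Sigma_h)            # nonsingular here, so a plain inverse

# P = (S' Sigma^-1 S)^-1 S' Sigma^-1
P <- solve(t(S) %*% Sigma_inv %*% S, t(S) %*% Sigma_inv)

Y_hat <- c(65, 11, 19, 31)             # incoherent base forecasts (invented)
Y_tilde <- drop(S %*% P %*% Y_hat)     # revised forecasts

# Variance of the revised forecasts, written both ways
V_long  <- S %*% P %*% Sigma_h %*% t(P) %*% t(S)
V_short <- S %*% solve(t(S) %*% Sigma_inv %*% S) %*% t(S)
```

The revised total equals the sum of the revised bottom-level forecasts, and the two variance expressions agree up to rounding.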
4 Approximately optimal forecasts
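Optimal combination forecasts
$$\tilde{Y}_n(h) = S(S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger}\hat{Y}_n(h)$$
Here $\tilde{Y}_n(h)$ are the revised forecasts and $\hat{Y}_n(h)$ are the base forecasts.
Solution 1: OLS
Assume $\varepsilon_h \approx S\varepsilon_{B,h}$, where $\varepsilon_{B,h}$ is the forecast error at the bottom level.
Then $\Sigma_h \approx S\Omega_hS'$, where $\Omega_h = V(\varepsilon_{B,h})$.
If the Moore-Penrose generalized inverse is used, then $(S'\Sigma_h^{\dagger}S)^{-1}S'\Sigma_h^{\dagger} = (S'S)^{-1}S'$.
$$\tilde{Y}_n(h) = S(S'S)^{-1}S'\hat{Y}_n(h)$$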
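The reduction from GLS to OLS can be verified numerically. This is a sketch under stated assumptions: a small hierarchy (Total, A, B, C), an invented positive definite bottom-level covariance `Omega_h`, and a helper `pinv()` (not from any package) that computes the Moore-Penrose inverse via the singular value decomposition.

```r
# Moore-Penrose inverse via SVD (base R only; helper name is hypothetical)
pinv <- function(A, tol = 1e-10) {
  s <- svd(A)
  keep <- s$d > tol * max(s$d)
  s$v[, keep, drop = FALSE] %*% (t(s$u[, keep, drop = FALSE]) / s$d[keep])
}

S <- rbind(c(1, 1, 1), diag(3))
Omega_h <- matrix(c(2.0, 0.5, 0.2,
                    0.5, 1.0, 0.3,
                    0.2, 0.3, 1.5), nrow = 3)   # invented covariance
Sigma_h <- S %*% Omega_h %*% t(S)               # 4 x 4 but rank 3, so singular
Sigma_dag <- pinv(Sigma_h)

P_gls <- solve(t(S) %*% Sigma_dag %*% S, t(S) %*% Sigma_dag)
P_ols <- solve(t(S) %*% S, t(S))
max_diff <- max(abs(P_gls - P_ols))             # should be numerically zero
```

Changing `Omega_h` to any other positive definite matrix should leave `max_diff` at rounding-error level, which is the point of the slide.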
Optimal combination forecasts
$$\tilde{Y}_n(h) = S(S'S)^{-1}S'\hat{Y}_n(h)$$
GLS = OLS.
Optimal weighted average of initial forecasts.
Optimal reconciliation weights are $S(S'S)^{-1}S'$.
Weights are independent of the data and of the covariance structure of the hierarchy!
Optimal combination forecasts
$$\tilde{Y}_n(h) = S(S'S)^{-1}S'\hat{Y}_n(h)$$
Hierarchy: Total with children A, B, C.
Weights:
$$S(S'S)^{-1}S' = \begin{bmatrix}
0.75 & 0.25 & 0.25 & 0.25 \\
0.25 & 0.75 & -0.25 & -0.25 \\
0.25 & -0.25 & 0.75 & -0.25 \\
0.25 & -0.25 & -0.25 & 0.75
\end{bmatrix}$$
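The weight matrix for this four-series hierarchy can be reproduced in a few lines of base R. The base forecasts below are invented and serve only to show the weights being applied.

```r
S <- rbind(c(1, 1, 1), diag(3))               # rows: Total, A, B, C
W <- S %*% solve(t(S) %*% S) %*% t(S)         # reconciliation weights
dimnames(W) <- list(c("Total", "A", "B", "C"),
                    c("Total", "A", "B", "C"))

Y_hat <- c(Total = 65, A = 11, B = 19, C = 31)   # invented base forecasts
Y_tilde <- drop(W %*% Y_hat)                     # reconciled forecasts
```

With these inputs the base forecasts disagree by 4 units, and the weights spread that discrepancy evenly: the total moves down by 1 and each bottom-level forecast moves up by 1.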
Optimal combination forecasts
Hierarchy: Total with children A (AA, AB, AC), B (BA, BB, BC), C (CA, CB, CC).
Weights (rows and columns ordered Total, A, B, C, AA, AB, AC, BA, BB, BC, CA, CB, CC):
$$S(S'S)^{-1}S' = \begin{bmatrix}
0.69 & 0.23 & 0.23 & 0.23 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 & 0.08 \\
0.23 & 0.58 & -0.17 & -0.17 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & 0.58 & -0.17 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 & -0.06 & -0.06 & -0.06 \\
0.23 & -0.17 & -0.17 & 0.58 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & -0.06 & 0.19 & 0.19 & 0.19 \\
0.08 & 0.19 & -0.06 & -0.06 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & 0.19 & -0.06 & -0.06 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & 0.19 & -0.06 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73 & -0.02 & -0.02 & -0.02 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & 0.73 & -0.27 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & 0.73 & -0.27 \\
0.08 & -0.06 & -0.06 & 0.19 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.02 & -0.27 & -0.27 & 0.73
\end{bmatrix}$$
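The same calculation scales to the two-level hierarchy. The sketch below builds S with `kronecker()` (one convenient way to construct it, not necessarily how the hts package does) and checks that the weight matrix is a symmetric projection, so forecasts that already add up are left unchanged.

```r
# 9 bottom series in 3 groups of 3, plus 3 group totals and 1 grand total
S <- rbind(rep(1, 9),                          # Total
           kronecker(diag(3), t(rep(1, 3))),   # A, B, C
           diag(9))                            # AA ... CC
W <- S %*% solve(crossprod(S)) %*% t(S)        # 13 x 13 weights

top_row <- round(W[1, 1:5], 2)                 # weights for the Total forecast

# Projection properties: idempotent, symmetric, and W S = S
is_idempotent <- isTRUE(all.equal(W %*% W, W))
is_symmetric  <- isTRUE(all.equal(W, t(W)))
keeps_coherent <- isTRUE(all.equal(W %*% S, S))
```

The exact value of the top-left weight is 9/13, which rounds to the 0.69 shown on the slide.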
Features
Forget “bottom up” or “top down”. This approach combines all forecasts optimally.
Method outperforms bottom-up and top-down, especially for middle levels.
Covariates can be included in initial forecasts.
Adjustments can be made to initial forecasts at any level.
Very simple and flexible method. Can work with any hierarchical or grouped time series.
Conceptually easy to implement: OLS on base forecasts.
Challenges
$$\tilde{Y}_n(h) = S(S'S)^{-1}S'\hat{Y}_n(h)$$
Computational difficulties in big hierarchies due to size of the S matrix and non-singular behavior of $(S'S)$.
Need to estimate covariance matrix to produce prediction intervals.
Assumption might be unrealistic.
Ignores covariance matrix in computing point forecasts.
Optimal combination forecasts
$$\tilde{Y}_n(h) = S(S'S)^{-1}S'\hat{Y}_n(h)$$
Solution 2: Rescaling
Suppose we rescale the original forecasts by $\Lambda$, reconcile using OLS, and backscale:
$$\tilde{Y}^*_n(h) = S(S'\Lambda^2S)^{-1}S'\Lambda^2\hat{Y}_n(h).$$
If $\Lambda = (\Sigma_h^{\dagger})^{1/2}$, we get the GLS solution.
Approximately optimal solution:
$$\Lambda = \left[\operatorname{diagonal}(\Sigma_1^{\dagger})\right]^{1/2}$$
That is, $\Lambda$ contains inverse one-step forecast standard deviations.
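A minimal sketch of the rescaling idea, assuming a small hierarchy (Total, A, B, C) and invented one-step forecast standard deviations. The function name `reconcile_wls()` is hypothetical and is not part of the hts package.

```r
# Weighted reconciliation: Lambda holds inverse one-step forecast sds
reconcile_wls <- function(Y_hat, S, sd1) {
  L2 <- diag(1 / sd1^2)                                  # Lambda^2
  drop(S %*% solve(t(S) %*% L2 %*% S, t(S) %*% L2 %*% Y_hat))
}

S <- rbind(c(1, 1, 1), diag(3))        # rows: Total, A, B, C
Y_hat <- c(65, 11, 19, 31)             # invented base forecasts
sd1 <- c(4, 1, 2, 3)                   # invented one-step forecast sds

Y_star <- reconcile_wls(Y_hat, S, sd1)         # rescaled reconciliation
Y_ols  <- reconcile_wls(Y_hat, S, rep(1, 4))   # equal weights reduce to OLS
```

Series with small forecast standard deviations receive larger weights, so their base forecasts are adjusted less than under OLS.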
Optimal combination forecasts
$$\tilde{Y}^*_n(h) = S(S'\Lambda^2S)^{-1}S'\Lambda^2\hat{Y}_n(h)$$
Solution 3: Averaging
If the bottom level error series are approximately uncorrelated and have similar variances, then $\Lambda$ is inversely proportional to the number of series making up each element of Y.
So set $\Lambda$ to be the inverse row sums of S.
Then $\Lambda\hat{Y}_n(h)$ is the average at each node rather than the sum at each node.
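The averaging weights follow directly from the row sums of S. The sketch below assumes the small hierarchy (Total, A, B, C) and invented base forecasts.

```r
S <- rbind(c(1, 1, 1), diag(3))        # rows: Total, A, B, C
Lambda <- diag(1 / rowSums(S))         # 1/3 for Total, 1 for each bottom series

Y_hat <- c(65, 11, 19, 31)             # invented base forecasts
node_avg <- drop(Lambda %*% Y_hat)     # Total becomes a per-series average

# Reconcile with these weights
L2 <- Lambda %*% Lambda
Y_star <- drop(S %*% solve(t(S) %*% L2 %*% S, t(S) %*% L2 %*% Y_hat))
```

Because the weights depend only on S, this variant needs no estimate of the forecast error variances.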
5 Application to Australian tourism
Application to Australian tourism
Quarterly data on visitor nights
Domestic visitor nights from 1998 to 2006
Data from: National Visitor Survey, based on annual interviews of 120,000 Australians aged 15+, collected by Tourism Research Australia.
Also split by purpose of travel:
Holiday
Visits to friends and relatives
Business
Other
Exponential smoothing methods
                              Seasonal Component
Trend Component               N (None)   A (Additive)   M (Multiplicative)
N  (None)                     N,N        N,A            N,M
A  (Additive)                 A,N        A,A            A,M
Ad (Additive damped)          Ad,N       Ad,A           Ad,M
M  (Multiplicative)           M,N        M,A            M,M
Md (Multiplicative damped)    Md,N       Md,A           Md,M
N,N: Simple exponential smoothing
A,N: Holt’s linear method
Ad,N: Additive damped trend method
M,N: Exponential trend method
Md,N: Multiplicative damped trend method
A,A: Additive Holt-Winters’ method
A,M: Multiplicative Holt-Winters’ method
There are 15 separate exponential smoothing methods.
Each can have an additive or multiplicative error, giving 30 separate models.
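The counts can be reproduced by enumerating the combinations. This is only a bookkeeping sketch in base R; the variable names are arbitrary.

```r
trend    <- c("N", "A", "Ad", "M", "Md")
seasonal <- c("N", "A", "M")
error    <- c("A", "M")

# 5 trend types x 3 seasonal types
methods <- expand.grid(trend = trend, seasonal = seasonal,
                       stringsAsFactors = FALSE)

# Each method with additive or multiplicative errors
models <- expand.grid(error = error, trend = trend, seasonal = seasonal,
                      stringsAsFactors = FALSE)
labels <- with(models, paste(error, trend, seasonal, sep = ","))

n_methods <- nrow(methods)
n_models  <- nrow(models)
```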
General notation ETS (Error, Trend, Seasonal): ExponenTial Smoothing
Examples:
A,N,N: Simple exponential smoothing with additive errors
A,A,N: Holt’s linear method with additive errors
M,A,M: Multiplicative Holt-Winters’ method with multiplicative errors
Innovations state space models
- All ETS models can be written in innovations state space form (IJF, 2002).
- Additive and multiplicative versions give the same point forecasts but different prediction intervals.
Automatic forecasting
From Hyndman et al. (IJF, 2002):
Apply each of 30 models that are appropriate to the data. Optimize parameters and initial values using MLE (or some other criterion).
Select best method using AIC:
AIC = −2 log(Likelihood) + 2p
where p = # parameters.
Produce forecasts using best method.
Obtain prediction intervals using underlying state space model.
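The selection step is simple arithmetic once each candidate has been fitted. The sketch below uses invented log-likelihoods and parameter counts for three candidate models; it does not fit anything and is not the forecast package's implementation.

```r
aic <- function(loglik, p) -2 * loglik + 2 * p

# Hypothetical fitted candidates (all numbers invented for illustration)
candidates <- data.frame(
  model  = c("A,N,N", "A,A,N", "M,A,M"),
  loglik = c(-120.4, -118.9, -117.5),
  p      = c(3, 5, 9),
  stringsAsFactors = FALSE
)
candidates$aic <- aic(candidates$loglik, candidates$p)

# Smallest AIC wins
best <- candidates$model[which.min(candidates$aic)]
```

Here the richest model has the highest likelihood but loses on AIC because of its extra parameters.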
Base forecasts
[Figures: Domestic tourism forecasts, visitor nights by year (1998 to 2008), one panel each for Total, NSW, VIC, Nth.Coast.NSW, Metro.QLD, Sth.WA, X201.Melbourne, X402.Murraylands, X809.Daly]
Hierarchy: states, zones, regions
MAPE by forecast horizon (h)
                         h=1     h=2     h=4     h=6     h=8   Average
Top Level: Australia
  Bottom-up             3.79    3.58    4.01    4.55    4.24    4.06
  OLS                   3.83    3.66    3.88    4.19    4.25    3.94
  Scaling               3.68    3.56    3.97    4.57    4.25    4.04
  Averaging             3.76    3.60    4.01    4.58    4.22    4.06
Level 1: States
  Bottom-up            10.70   10.52   10.85   11.46   11.27   11.03
  OLS                  11.07   10.58   11.13   11.62   12.21   11.35
  Scaling              10.44   10.17   10.47   10.97   10.98   10.67
  Averaging            10.59   10.36   10.69   11.27   11.21   10.89
Level 2: Zones
  Bottom-up            14.99   14.97   14.98   15.69   15.65   15.32
  OLS                  15.16   15.06   15.27   15.74   16.15   15.48
  Scaling              14.63   14.62   14.68   15.17   15.25   14.94
  Averaging            14.79   14.79   14.85   15.46   15.49   15.14
Bottom Level: Regions
  Bottom-up            33.12   32.54   32.26   33.74   33.96   33.18
  OLS                  35.89   33.86   34.26   36.06   37.49   35.43
  Scaling              31.68   31.22   31.08   32.41   32.77   31.89
  Averaging            32.84   32.20   32.06   33.44   34.04   32.96
Based on a rolling forecast origin with at least 12 observations in the training set.
Groups: Purpose, states, capital
MAPE by forecast horizon (h)
                         h=1     h=2     h=4     h=6     h=8   Average
Top Level: Australia
  Bottom-up             3.48    3.30    4.04    4.56    4.58    4.03
  OLS                   3.80    3.64    3.94    4.22    4.35    3.95
  Scaling               3.65    3.45    4.00    4.52    4.57    4.04
  Averaging             3.59    3.33    3.99    4.56    4.58    4.04
Level 1: Purpose of travel
  Bottom-up             8.14    8.37    9.02    9.39    9.52    8.95
  OLS                   7.94    7.91    8.66    8.66    9.29    8.54
  Scaling               7.99    8.10    8.59    9.09    9.43    8.71
  Averaging             8.04    8.21    8.79    9.25    9.44    8.82
Level 2: States
  Bottom-up            21.34   21.75   22.39   23.26   23.31   22.58
  OLS                  22.17   21.80   23.53   23.15   23.90   22.99
  Scaling              21.49   21.62   22.20   23.13   23.25   22.51
  Averaging            21.38   21.61   22.30   23.17   23.24   22.51
Bottom Level: Capital city versus other
  Bottom-up            31.97   31.65   32.19   33.70   33.47   32.62
  OLS                  32.31   30.92   32.41   33.35   34.13   32.55
  Scaling              32.12   31.36   32.18   33.36   33.43   32.52
  Averaging            31.92   31.39   32.04   33.51   33.39   32.49
Based on a rolling forecast origin with at least 12 observations in the training set.
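For readers unfamiliar with the evaluation scheme, here is a rough sketch of MAPE under a rolling forecast origin with a minimum training length of 12. The naive last-value forecaster is only a placeholder assumption; the study itself used ETS base forecasts, and the function names are hypothetical.

```r
# Mean absolute percentage error
mape <- function(actual, forecast) {
  mean(abs(100 * (actual - forecast) / actual))
}

# Rolling origin: refit at each origin k >= min_train, score the h-step forecast
rolling_mape <- function(y, h, min_train = 12,
                         fc_fun = function(train, h) rep(tail(train, 1), h)) {
  n <- length(y)
  origins <- min_train:(n - h)
  ape <- vapply(origins, function(k) {
    f <- fc_fun(y[1:k], h)[h]                 # h-step-ahead forecast from origin k
    abs(100 * (y[k + h] - f) / y[k + h])
  }, numeric(1))
  mean(ape)
}

y <- rep(100, 20)                             # constant toy series
flat_error <- rolling_mape(y, h = 1)          # naive forecast is exact here
```

Any forecasting function with the same `(train, h)` signature can be passed as `fc_fun`.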
6 hts package for R
hts package for R
hts: Hierarchical and grouped time series
Methods for analysing and forecasting hierarchical and grouped time series
Version: 3.01
Depends: forecast
Imports: SparseM
Published: 2013-05-07
Author: Rob J Hyndman, Roman A Ahmed, and Han Lin Shang
Maintainer: Rob J Hyndman <Rob.Hyndman at monash.edu>
License: GPL-2 | GPL-3 [expanded from: GPL (≥ 2)]
Example using Rlibrary(hts)
# bts is a matrix containing the bottom level time series# g describes the grouping/hierarchical structurey <- hts(bts, g=c(1,1,2,2))
Hierarchy described by g=c(1,1,2,2):
Total
  A
    AX  AY
  B
    BX  BY
# Forecast 10-step-ahead using optimal (OLS) combination method
# ETS used for each series by default
fc <- forecast(y, h=10)
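# Select your own methods
# (mymethod is a placeholder for a user-supplied function that
# returns h point forecasts for a single series)
ally <- allts(y)
allf <- matrix(NA, nrow=10, ncol=ncol(ally))
for(i in 1:ncol(ally))
  allf[,i] <- mymethod(ally[,i], h=10)
allf <- ts(allf, start=2004)

# Reconcile forecasts so they add up
fc2 <- combinef(allf, Smatrix(y))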
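The reconciliation step can be illustrated in base R for the small hierarchy above. This is a sketch of the OLS combination idea from the talk (project the base forecasts onto the space spanned by the summing matrix), not the package's internal code. The helper reconcile_ols and the forecast values are made up for illustration.

```r
# Summing matrix for Total -> (A, B), A -> (AX, AY), B -> (BX, BY).
# Rows: Total, A, B, AX, AY, BX, BY; columns: the four bottom-level series.
S <- rbind(c(1, 1, 1, 1),
           c(1, 1, 0, 0),
           c(0, 0, 1, 1),
           diag(4))
rownames(S) <- c("Total", "A", "B", "AX", "AY", "BX", "BY")

# OLS reconciliation (hypothetical helper).
# base_fc is an h x 7 matrix, one column per series, ordered like the rows of S.
reconcile_ols <- function(base_fc, S) {
  P <- solve(crossprod(S), t(S))   # (S'S)^{-1} S'
  bottom <- base_fc %*% t(P)       # reconciled bottom-level forecasts (h x 4)
  bottom %*% t(S)                  # aggregate back up to every level (h x 7)
}

# Made-up one-step base forecasts that do not add up:
# Total = 100, but A + B = 105 and AX + AY = 58 rather than 60.
base_fc <- rbind(c(100, 60, 45, 30, 28, 25, 22))
rec <- reconcile_ols(base_fc, S)
colnames(rec) <- rownames(S)
print(round(rec, 2))
```

After reconciliation the Total column equals A + B, and each of A and B equals the sum of its children, whatever the base forecasts were. Base forecasts that already add up are returned unchanged.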
hts function

Usage
hts(y, g)
gts(y, g, hierarchical=FALSE)

Arguments
y             Multivariate time series containing the bottom level series
g             Group matrix indicating the group structure, with one column for each series when completely disaggregated, and one row for each grouping of the time series.
hierarchical  Indicates if the grouping matrix should be treated as hierarchical.

Details
hts is simply a wrapper for gts(y,g,TRUE). Both return an object of class gts.
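To make the role of g concrete, the following base R sketch turns a group specification into a summing matrix and uses it to aggregate the bottom-level series. The helper summing_matrix is hypothetical and only roughly mirrors what the package does internally. The row ordering (total first, then groups, then bottom level) and the data are assumptions for illustration.

```r
# Build a summing matrix from a group specification (illustrative helper,
# not a function from the hts package).
# g: a vector, or a matrix with one row per grouping and one column per
#    bottom-level series; entries are group labels.
summing_matrix <- function(g) {
  g <- rbind(g)                       # a vector becomes a one-row matrix
  n <- ncol(g)
  group_rows <- lapply(seq_len(nrow(g)), function(i) {
    labels <- unique(g[i, ])
    # One indicator row per distinct label in this grouping
    t(sapply(labels, function(l) as.numeric(g[i, ] == l)))
  })
  S <- rbind(rep(1, n), do.call(rbind, group_rows), diag(n))
  dimnames(S) <- NULL
  S
}

S <- summing_matrix(c(1, 1, 2, 2))   # 7 x 4: Total, A, B, then AX, AY, BX, BY
print(S)

# Aggregate a bottom-level matrix (rows = time, columns = AX, AY, BX, BY)
# into all seven series at once.
bts <- cbind(AX = c(1, 2), AY = c(3, 4), BX = c(5, 6), BY = c(7, 8))
all_series <- bts %*% t(S)
print(all_series)
```

With g=c(1,1,2,2), the first two bottom-level series are summed into one group and the last two into another, which is the Total/A/B structure used in the example slides.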
forecast.gts function

Usage
forecast(object, h,
  method = c("comb", "bu", "mo", "tdgsf", "tdgsa", "tdfp", "all"),
  fmethod = c("ets", "rw", "arima"), level, positive = FALSE,
  xreg = NULL, newxreg = NULL, ...)

Arguments
object    Hierarchical time series object of class gts.
h         Forecast horizon
method    Method for distributing forecasts within the hierarchy.
fmethod   Forecasting method to use
level     Level used for "middle-out" method (when method="mo")
positive  If TRUE, forecasts are forced to be strictly positive
xreg      When fmethod = "arima", a vector or matrix of external regressors, which must have the same number of rows as the original univariate time series
newxreg   When fmethod = "arima", a vector or matrix of external regressors for the forecast period, which must have as many rows as the forecast horizon h
...       Other arguments passing to ets or auto.arima
Utility functions

allts(y)     Returns all series in the hierarchy
Smatrix(y)   Returns the summing matrix
combinef(f)  Combines initial forecasts optimally.
More information
Vignette on CRAN
References
RJ Hyndman, RA Ahmed, G Athanasopoulos, and HL Shang (2011). “Optimal combination forecasts for hierarchical time series”. Computational Statistics and Data Analysis 55(9), 2579–2589.
RJ Hyndman, RA Ahmed, and HL Shang (2013). hts: Hierarchical time series. cran.r-project.org/package=hts.
RJ Hyndman and G Athanasopoulos (2013). Forecasting: principles and practice. OTexts. OTexts.org/fpp/.
Papers and R code: robjhyndman.com
Email: [email protected]