STIFF: A Forecasting Framework for Spatio-Temporal Data Zhigang Li, Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist University (Abbreviated Version from PAKDD ’02)




May 6, 2002 Li & Dunham, PAKDD 2

Our goal

In this paper, we present a novel forecasting framework for spatio-temporal data, in which not only spatial but also temporal characteristics of the data are considered to obtain a more appropriate result.


Presentation Outline

- Motivation
- Our Approach: STIFF
  - Combining two approaches to achieve better results: Time Series Analysis and ANNs
- Performance
- Future Work


Why

There are many application fields that require spatio-temporal forecasting: river hydrology, biological patterns, housing price research, rainfall distribution, waste monitoring, fishery, hotel pickup rate, etc.

In spatio-temporal forecasting, both spatial and temporal properties, as well as their mutual correlation, are taken into account.


Flood Forecasting (Our Motivating Application)

- Catchment
- Many different types of sensors
- Predict at one sensor location
- Water level or flow rate
- May not be interested in actual prediction of value


Our approach: Problem definition

Δ={α0, α1, α2, … αn} is the research field, composed of n + 1 spatially separated subcomponents, each denoted αi.

WLOG, α0 is assumed to be the target place where forecasting is to be carried out.

For each αi in Δ, there are j observations taken at equal time intervals, denoted by Лi={αi1, αi2, αi3, … αij}.


Problem definition (Cont.)

- Given Δ={α0, α1, α2, … αn}, Л={Л1, Л2, … Лn}, the number of observations j, and the number of look-ahead steps ι, we want to find the best possible forecasting relationship ƒ, defined as follows.
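The notation above can be made concrete with a small Python sketch of how the observations and the look-ahead might be laid out as training pairs. The names `series`, `make_pairs`, and `look_ahead` are illustrative assumptions, not part of the original framework, and the numbers are made up.

```python
def make_pairs(series, look_ahead):
    """Pair each time step t with the target value look_ahead steps later.

    series: dict mapping subcomponent index i -> list of j observations,
            where index 0 is the target place (alpha_0).
    Returns a list of (inputs_at_t, target_at_t_plus_look_ahead) tuples.
    """
    j = len(series[0])
    if any(len(obs) != j for obs in series.values()):
        raise ValueError("every subcomponent needs the same number of observations")
    order = sorted(series)  # fixed subcomponent order: 0, 1, ..., n
    pairs = []
    for t in range(j - look_ahead):
        inputs = [series[i][t] for i in order]
        pairs.append((inputs, series[0][t + look_ahead]))
    return pairs


# Toy data: target alpha_0 plus two neighbours, five observations each.
series = {
    0: [1.0, 2.0, 3.0, 4.0, 5.0],
    1: [0.5, 1.5, 2.5, 3.5, 4.5],
    2: [9.0, 8.0, 7.0, 6.0, 5.0],
}
pairs = make_pairs(series, look_ahead=2)
```

Each pair holds what is known at time t across all subcomponents and the value at α0 that a forecasting relationship would be asked to predict.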


Our approach: Algorithm sketch

1) Describe the forecasting problem according to the problem definition.

2) Build a time series (ARIMA) model for each αi. Name the forecast from the α0 time series model ƒT.

3) Construct and train an ANN to capture the spatial correlation and influence over the target subcomponent α0. Name the forecast from the neural network ƒS.

4) Combine ƒT and ƒS via a statistical regression mechanism.
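As a rough illustration of the temporal forecast ƒT, the sketch below fits a first-order autoregressive model by least squares. This is a deliberately simpler stand-in for the ARIMA models named in the slides, not the authors' model; the function names and the toy series are assumptions.

```python
def fit_ar1(values):
    """Least-squares fit of y[t] = c + phi * y[t-1]; returns (c, phi)."""
    x = values[:-1]
    y = values[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("series is constant; phi is not identifiable")
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recurrence `steps` times from the last observation."""
    value = last_value
    for _ in range(steps):
        value = c + phi * value
    return value


# Toy series generated exactly by y[t] = 1 + 0.5 * y[t-1], starting at 0.
history = [0.0, 1.0, 1.5, 1.75, 1.875]
c, phi = fit_ar1(history)
f_T = forecast_ar1(history[-1], c, phi, steps=1)
```

A real ARIMA model adds differencing and moving-average terms; the point here is only that ƒT uses the target's own history and nothing from the other subcomponents.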


Find the spatial influence

Normally it is much harder to find than its temporal counterpart in the problem.

There is no precise way to convert a spatial measurement into the change in value it may cause.

Time has only 1 dimension while space has 3 (or 2).

A simple “distance” measure is not enough; other factors are important.


Artificial Neural Network (ANN)

Why is an ANN used for finding spatial influence?

- It is a “black-box”, non-linear technology used to find hidden patterns.
- Like the human brain, it can self-adjust and learn automatically even if the problem is not well defined.
- Practice proves its usefulness: [See, 1997] found ANNs especially useful in “… situations where the underlying physical relationships are not fully understood …”


ANN Construction

- Simple 3-layer back-propagation MLP.
- One input node for each sensor value except α0. Actual input shifted by predicted time lag.
- The hidden layer has a certain number of neurons that have to be decided by experiment.
- The output layer has only one neuron, which corresponds to the target subcomponent α0.
- We also employ a pruning strategy to make the ANN structure as simple as possible without harming the efficacy much.
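A minimal sketch of the forward pass of such a network is given here, with pruning expressed as a 0/1 mask over the input-to-hidden links. The sigmoid hidden activation, the linear output neuron, and all names and weights are assumptions for illustration; the slides do not specify them, and back-propagation training is omitted.

```python
import math


def mlp_forward(inputs, w_hidden, b_hidden, mask, w_out, b_out):
    """Forward pass of a 3-layer MLP whose input-to-hidden links can be pruned.

    mask[h][i] is 1 to keep the link from input i to hidden neuron h, 0 to prune it.
    """
    hidden = []
    for h in range(len(w_hidden)):
        total = b_hidden[h]
        for i, x in enumerate(inputs):
            total += mask[h][i] * w_hidden[h][i] * x
        hidden.append(1.0 / (1.0 + math.exp(-total)))  # sigmoid activation
    # Single linear output neuron for the target subcomponent.
    return b_out + sum(w * a for w, a in zip(w_out, hidden))


# Made-up weights: two inputs, two hidden neurons.
w_hidden = [[1.0, 2.0], [0.5, -1.0]]
b_hidden = [0.0, 0.0]
mask = [[1, 0], [1, 0]]  # every link from input 1 is pruned
w_out = [1.0, 1.0]
out = mlp_forward([0.3, 5.0], w_hidden, b_hidden, mask, w_out, b_out=0.0)
```

With every link from input 1 pruned, changing that input no longer changes the output, which is the effect pruning is meant to have on sensors judged irrelevant to a hidden neuron.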


Integrate the two forecasts

We have two forecasts so far at the target subcomponent α0. One is ƒT, from the time series model, and the other is ƒS, from the ANN. We may:

- Either dynamically select one of the two as the current forecast;
- Or fuse them together, since they contribute to the overall forecast from two different aspects. (That’s the approach we take in the paper.)

The two forecasts are integrated via a very simple linear regression mechanism. Of course, more advanced alternatives could be used instead for better results.
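One way such a regression could be fitted is ordinary least squares on past forecasts, sketched below with the normal equations solved by Gaussian elimination. The helper names and the toy numbers are assumptions; the slides say only that a simple linear regression is used.

```python
def solve_linear(a, b):
    """Solve a small dense system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: forecasts are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fusion(f_t, f_s, actual):
    """Least-squares fit of actual ~ x1 * f_t + x2 * f_s + c via normal equations."""
    rows = [[t, s, 1.0] for t, s in zip(f_t, f_s)]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    aty = [sum(r[i] * y for r, y in zip(rows, actual)) for i in range(3)]
    return solve_linear(ata, aty)


# Made-up past forecasts; the actual values follow 0.6*f_t + 0.4*f_s + 1 exactly.
f_t = [1.0, 2.0, 3.0, 4.0]
f_s = [2.0, 1.0, 4.0, 3.0]
actual = [0.6 * t + 0.4 * s + 1.0 for t, s in zip(f_t, f_s)]
x1, x2, c = fit_fusion(f_t, f_s, actual)
```

The fitted weights indicate how much each forecast contributes; on real data they would be estimated on a training period and then applied to new pairs of forecasts.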


A case study (National River Flow Archive – Great Britain)

Here we are going to present a practical case study to demonstrate how the framework works.

We conduct spatio-temporal forecasting of the river water flow rate (m³/s) at the outlet gauging station 28010. The basin is shown as follows.

The target station is 28010; its sibling stations lie upstream.

Derwent Catchment

Daily mean flow values


Data transformation

Checking the water flow rate data at station 28010 shows that the data is not very stable. Abrupt changes are obvious and are present roughly 25% of the time.

We therefore first transform the data, following the proposed approach discussed before.

We empirically vary the value of λ from -1.0 to 1.0 in steps of 0.1. It turns out that λ = 0.0 is (relatively) the best. In other words, we log-transform the original water flow rate data.
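The λ family is not named on the slide. Assuming it is the Box-Cox power transform, which is consistent with λ = 0 reducing to the logarithm, the transform and its inverse could be sketched as follows. The function names are illustrative, and the criterion used to judge which λ is best is not given in the slides, so it is not shown.

```python
import math


def box_cox(value, lam):
    """Box-Cox power transform of a positive value; lam == 0 is the log case."""
    if value <= 0:
        raise ValueError("Box-Cox needs strictly positive values")
    if abs(lam) < 1e-12:
        return math.log(value)
    return (value ** lam - 1.0) / lam


def inverse_box_cox(transformed, lam):
    """Map a transformed value back to the original scale."""
    if abs(lam) < 1e-12:
        return math.exp(transformed)
    return (lam * transformed + 1.0) ** (1.0 / lam)


# Candidate lambdas from -1.0 to 1.0 in steps of 0.1, as on the slide.
candidates = [round(-1.0 + 0.1 * k, 1) for k in range(21)]
```

Forecasts made on the transformed scale would be mapped back with `inverse_box_cox` before being compared with observed flow rates.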


Actual Flow at Derwent


Case Study ANN

- 6 input nodes
- 1 output node
- 6 chosen as number of hidden nodes based on experimentation
- Number of links pruned based on river topology
- Lag time used for input based on expected flow lag time
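The lag-shifted input can be pictured with a short sketch: for a target day t, each upstream station contributes the reading taken one travel time earlier. The station names, flows, and lag values below are hypothetical, not taken from the case study.

```python
def lagged_inputs(upstream, lags, t):
    """Collect, for target time t, each upstream reading taken `lag` steps earlier.

    upstream: dict station -> list of daily values.
    lags:     dict station -> assumed travel time to the target, in days.
    Returns None when some station has no reading that far back.
    """
    row = []
    for station in sorted(upstream):
        source_t = t - lags[station]
        if source_t < 0 or source_t >= len(upstream[station]):
            return None
        row.append(upstream[station][source_t])
    return row


# Hypothetical stations "A" and "B" with made-up flows and lags.
upstream = {"A": [10.0, 11.0, 12.0, 13.0], "B": [5.0, 6.0, 7.0, 8.0]}
lags = {"A": 1, "B": 2}
row = lagged_inputs(upstream, lags, t=3)
```

Here day 3 at the target is paired with day 2 at station A and day 1 at station B, so each input reflects water that would be expected to reach the outlet on the target day.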


Building models

Following the framework specification, we then build a time series model based upon the dataset collected from each gauging station.

An ANN is constructed after that, with the spatially-induced pruning strategy applied to remove as many unnecessary links as possible while sacrificing little forecasting accuracy.

The final overall spatio-temporal forecast is then generated by the following simple regression:


STIFF Model

Diagram: the ANN forecast ƒS and the time series forecast ƒT are combined as x1 ƒT + x2 ƒS + C.


Performance Analysis

- Compared STIFF to pure time series (CTS) and pure ANN (CANN)
- Data starting at 10/01/75
- 30, 60, 120 days
- Normalized Absolute Ratio Error (NARE)


Forecasting result

The forecasting comparison result, measured in NARE, is outlined in the following table. The other two models, built to the best of our knowledge, are used for comparison with STIFF.

Here “Over” means overestimation and “Under” means underestimation.
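The slides name NARE without defining it. The sketch below assumes it is the total absolute error divided by the total absolute observed value, split into an overestimation part and an underestimation part; this definition and the function name are assumptions and should be checked against the full paper.

```python
def nare(actual, forecast):
    """Assumed NARE: total absolute error divided by total absolute observed value.

    Returns (overall, over, under), where `over` and `under` are the parts of
    the overall ratio that come from over- and underestimation respectively.
    """
    denom = sum(abs(a) for a in actual)
    if denom == 0:
        raise ValueError("observed values sum to zero")
    over = sum(f - a for a, f in zip(actual, forecast) if f > a) / denom
    under = sum(a - f for a, f in zip(actual, forecast) if f < a) / denom
    return over + under, over, under


# Made-up observed and forecast flows.
overall, over, under = nare([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
```

Under this reading, a model whose `over` and `under` parts are similar in size is balanced in the sense used when comparing the models.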


Result 30 Days


Conclusion

- STIFF has better forecast accuracy than the normal single time series model and ANN model, and is more balanced (over- vs. underestimation).
- Compared with other related work, it avoids oversimplification.
- Does not have the large variation problem.
- STIFF requires much human intervention and interpretation.
- STIFF is promising for future research.


Future work

- Extend to multivariate forecasting. So far, it can only deal with univariate forecasting.
- Use more sophisticated fusing techniques
- Test on more flood data
- Compare to other techniques
- Examine different ANN structures
- Extend to other application domains
- …
