Approximate Querying about the Past, the Present, and the Future in Spatio-Temporal Databases
![Page 1: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/1.jpg)
Approximate querying about the Past, the Present, and the Future
in Spatio-Temporal Databases
Jimeng Sun, Dimitris Papadias,
Yufei Tao, Bin Liu
![Page 2: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/2.jpg)
Motivation
• Spatio-temporal databases vs. data streams
• Monitoring applications
– Traffic supervision
– Mobile user monitoring
– Weather forecasting
• Example:
– Find the number of vehicles in the city center now
• The challenge is to provide fast query responses in a highly update-intensive environment
![Page 3: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/3.jpg)
Problems and methods
• Problems:
– How to efficiently store and summarize the spatio-temporal information?
– How to approximately answer queries about the past, the present, and the future?
• Methods:
– Adaptive multi-dimensional histogram (AMH)
– Historical synopsis
– Stochastic prediction method
![Page 4: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/4.jpg)
Related work
• Histograms
– Static multi-dimensional histograms: Equi-depth, Mhist, Minskew, Genhist, SQ
– Query-adaptive multi-dimensional histograms: STGrid, STHoles, SASH
• Other approximation methods
– DCT, Wavelet, Sketch
• Spatio-temporal databases
– Historical retrieval
– Future prediction
![Page 5: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/5.jpg)
Outline
• Introduction
• Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
• Experiment
• Conclusion
![Page 6: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/6.jpg)
Query types
[Figure: queries plotted by location against time. Historical Time (HT) queries refer to the past, Present Time (PT) queries to the current time, and Future Time (FT) queries to the future.]
![Page 7: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/7.jpg)
System Overview
[Figure: system architecture. Labeled components: spatio-temporal updates, AMH, past index, historical synopsis, and prediction model, serving PT, HT, and FT queries.]
![Page 8: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/8.jpg)
Histogram
• Partition the space into buckets
• Data within a bucket are summarized by the mean
• Properties of a good histogram:
– Uniformity within each bucket
– Incrementally updatable
[Figure: two example partitionings of a 100 × 100 data space, one labeled "bad" and one labeled "good".]
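The reason uniformity within each bucket matters can be seen in how a histogram is typically used to answer a count query. The following is a minimal sketch, not taken from the slides: it assumes rectangular buckets that each store a total object count, and it estimates the count inside a query rectangle by scaling each bucket's count by the fraction of its area that the query covers. The names `Bucket`, `overlap_area`, and `estimate_count` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    """A rectangular bucket summarized by a single count (assumed layout)."""
    x_lo: float
    x_hi: float
    y_lo: float
    y_hi: float
    count: float  # number of objects summarized by this bucket

    def area(self) -> float:
        return (self.x_hi - self.x_lo) * (self.y_hi - self.y_lo)


def overlap_area(b: Bucket, qx_lo: float, qx_hi: float,
                 qy_lo: float, qy_hi: float) -> float:
    """Area of the intersection between a bucket and the query rectangle."""
    width = min(b.x_hi, qx_hi) - max(b.x_lo, qx_lo)
    height = min(b.y_hi, qy_hi) - max(b.y_lo, qy_lo)
    return max(width, 0.0) * max(height, 0.0)


def estimate_count(buckets, qx_lo, qx_hi, qy_lo, qy_hi) -> float:
    """Estimate the number of objects in the query rectangle.

    Assumes objects are spread uniformly inside each bucket, so a bucket
    contributes count * (overlapping area / bucket area).
    """
    total = 0.0
    for b in buckets:
        total += b.count * overlap_area(b, qx_lo, qx_hi, qy_lo, qy_hi) / b.area()
    return total


if __name__ == "__main__":
    buckets = [Bucket(0, 50, 0, 100, 100), Bucket(50, 100, 0, 100, 300)]
    print(estimate_count(buckets, 25, 75, 0, 100))
```

The estimate is only as good as the uniformity assumption: a bucket that mixes dense and sparse areas gives a poor answer for queries that cover only part of it.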
![Page 9: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/9.jpg)
Adaptive Multi-dimensional Histogram (AMH)
• Objective: minimize $WVS = \sum_i (area_i \cdot var_i)$ (Minskew [Acharya, Poosala, Ramaswamy 99])
[Figure: a grid of regular cells with per-cell counts, partitioned into buckets b1 to b6, shown next to the BPT with nodes n1 to n5 and buckets b1 to b6.]
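To make the objective concrete, here is a small sketch of the WVS computation, read as the sum over buckets of area times variance. It is an illustration under assumptions the slides do not state: each bucket is given as the flat list of counts of the regular cells it covers, area is measured in number of cells, and variance is the population variance of those counts. The function names are hypothetical.

```python
def bucket_variance(cell_counts):
    """Population variance of the per-cell counts inside one bucket."""
    n = len(cell_counts)
    mean = sum(cell_counts) / n
    return sum((c - mean) ** 2 for c in cell_counts) / n


def wvs(buckets):
    """Sum over buckets of (area * variance), with area in cells (assumed)."""
    return sum(len(cells) * bucket_variance(cells) for cells in buckets)


if __name__ == "__main__":
    # The same four cells, partitioned two ways.
    one_bucket = [[1, 1, 5, 5]]       # mixes sparse and dense cells
    two_buckets = [[1, 1], [5, 5]]    # each bucket is internally uniform
    print(wvs(one_bucket), wvs(two_buckets))
```

A partitioning whose buckets are internally uniform drives WVS toward zero, which is the sense in which a lower WVS indicates a better histogram.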
![Page 10: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/10.jpg)
Dynamic Maintenance of AMH
• Our scheme: record information during construction and modify the structure as needed
– 1. Information update: update the bucket count
– 2. Bucket reorganization
• Merge: to reclaim buckets
• Split: to reduce WVS
![Page 11: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/11.jpg)
Information update of AMH
[Figure: the BPT (nodes n1 to n5, buckets b1 to b6) and the bucket list. An incoming update is mapped to bucket b1 through nodes n1 and n2.]
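The mapping step can be sketched as a tree descent. The slides do not say what each BPT node stores, so the following is a hedged illustration that assumes every internal node holds an axis-parallel split (an axis and a split position) and every leaf is a bucket with a count. It also assumes an update reports an object's old and new positions. `Leaf`, `Node`, `locate`, and `apply_update` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class Leaf:
    """A bucket at the bottom of the tree."""
    name: str
    count: int = 0


@dataclass
class Node:
    """An internal node with an assumed axis-parallel split."""
    axis: int      # 0 = split on x, 1 = split on y
    split: float   # points with coordinate < split go to `low`
    low: Union["Node", Leaf]
    high: Union["Node", Leaf]


def locate(root: Union[Node, Leaf], point: Tuple[float, float]) -> Leaf:
    """Walk from the root to the bucket whose region contains the point."""
    node = root
    while isinstance(node, Node):
        node = node.low if point[node.axis] < node.split else node.high
    return node


def apply_update(root, old_pos: Optional[Tuple[float, float]],
                 new_pos: Tuple[float, float]) -> None:
    """Move one object: decrement its old bucket, increment its new one."""
    if old_pos is not None:
        locate(root, old_pos).count -= 1
    locate(root, new_pos).count += 1


if __name__ == "__main__":
    b1, b2, b3 = Leaf("b1"), Leaf("b2"), Leaf("b3")
    root = Node(axis=0, split=50.0,
                low=Node(axis=1, split=50.0, low=b1, high=b2),
                high=b3)
    apply_update(root, None, (10.0, 10.0))          # new object appears in b1
    apply_update(root, (10.0, 10.0), (60.0, 20.0))  # it moves to b3
    print(b1.count, b2.count, b3.count)
```

Each update touches at most two buckets and costs one descent per position, which keeps the information update cheap compared with reorganizing the structure.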
![Page 12: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/12.jpg)
Bucket reorganization - Merge
[Figure: the BPT and bucket list before and after a merge. Before, the tree has nodes n1 to n5 and buckets b1 to b6. The subtree under n4 is merged into a single bucket b*, leaving nodes n1 to n3 and buckets b1, b2, b*, and b5.]
• Bucket info:
– 1. Region: [x-, x+] × [y-, y+]
– 2. Frequency: count/area
– 3. 2nd moment (for variance calculation)
• Merge the subtree that leads to the minimal WVS increase
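Keeping the second moment is what makes the WVS increase of a merge cheap to evaluate, because sums and sums of squares simply add. The sketch below illustrates this under assumptions not spelled out on the slide: each bucket tracks its number of cells, the sum of its cell counts, and the sum of squared cell counts, and area is measured in cells. It scores the merge of two buckets only; choosing which subtree to merge is left out. `Stats` and `merge_cost` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Per-bucket summary sufficient to recompute variance (assumed layout)."""
    n_cells: int      # bucket area, measured in cells
    total: float      # sum of cell counts (first moment)
    sq_total: float   # sum of squared cell counts (second moment)

    @classmethod
    def from_cells(cls, cells):
        return cls(len(cells), sum(cells), sum(c * c for c in cells))

    def variance(self) -> float:
        mean = self.total / self.n_cells
        return self.sq_total / self.n_cells - mean * mean

    def weighted_variance(self) -> float:
        """This bucket's contribution to WVS: area * variance."""
        return self.n_cells * self.variance()

    def merged(self, other: "Stats") -> "Stats":
        """Summary of the union of two buckets: all three fields add."""
        return Stats(self.n_cells + other.n_cells,
                     self.total + other.total,
                     self.sq_total + other.sq_total)


def merge_cost(a: Stats, b: Stats) -> float:
    """WVS increase caused by replacing buckets a and b with their union."""
    return (a.merged(b).weighted_variance()
            - a.weighted_variance() - b.weighted_variance())


if __name__ == "__main__":
    sparse = Stats.from_cells([1, 1])
    dense = Stats.from_cells([5, 5])
    similar = Stats.from_cells([1, 2])
    print(merge_cost(sparse, dense), merge_cost(sparse, similar))
```

Merging buckets with similar frequencies costs little WVS, so those are the natural candidates when buckets need to be reclaimed.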
![Page 13: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/13.jpg)
Bucket reorganization - Split
[Figure: the BPT before and after splitting bucket b*. The split adds nodes n4 and n5 and sub-buckets labeled b*1 to b*4.]
• Split the bucket that leads to the maximal WVS decrease
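One way to picture the split rule is an exhaustive search over cut positions. The sketch below is an assumption-laden illustration rather than the procedure from the slides: it represents a bucket as a small 2D grid of cell counts, tries every axis-parallel cut along cell boundaries, and reports the cut with the largest WVS decrease. A second helper picks the bucket whose best cut helps most. All function names are hypothetical.

```python
def weighted_variance(cells):
    """area * variance for a flat list of cell counts (area in cells)."""
    n = len(cells)
    if n == 0:
        return 0.0
    mean = sum(cells) / n
    return sum((c - mean) ** 2 for c in cells)


def best_split(grid):
    """Return (wvs_decrease, axis, position) for the best binary cut.

    `grid` is a list of rows of cell counts. axis is "row" or "col";
    position is the index of the first row/column in the second half.
    Returns (0.0, None, None) when no cut reduces WVS.
    """
    rows, cols = len(grid), len(grid[0])
    whole = weighted_variance([c for row in grid for c in row])
    best = (0.0, None, None)
    for r in range(1, rows):
        top = [c for row in grid[:r] for c in row]
        bottom = [c for row in grid[r:] for c in row]
        gain = whole - weighted_variance(top) - weighted_variance(bottom)
        if gain > best[0]:
            best = (gain, "row", r)
    for c in range(1, cols):
        left = [v for row in grid for v in row[:c]]
        right = [v for row in grid for v in row[c:]]
        gain = whole - weighted_variance(left) - weighted_variance(right)
        if gain > best[0]:
            best = (gain, "col", c)
    return best


def pick_bucket_to_split(grids):
    """Index of the bucket whose best cut gives the largest WVS decrease."""
    gains = [best_split(g)[0] for g in grids]
    return max(range(len(grids)), key=gains.__getitem__)


if __name__ == "__main__":
    skewed = [[1, 1, 5, 5],
              [1, 1, 5, 5]]
    uniform = [[2, 2],
               [2, 2]]
    print(best_split(skewed), pick_bucket_to_split([uniform, skewed]))
```

A uniform bucket gains nothing from any cut, so the search naturally directs splits toward buckets whose contents have become skewed.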
![Page 14: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/14.jpg)
Features of AMH
• Bucket information is updated as new data arrive
• Bucket extents continuously adapt to changes in the data distribution
• Maintenance does not affect normal query processing
– It can be interrupted at any moment
– It is performed during CPU idle time
![Page 15: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/15.jpg)
Outline
• Introduction
• Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
• Experiment
• Conclusion
![Page 16: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/16.jpg)
Historical Synopsis
• AMH maintains the current buckets.
• The past index stores the obsolete buckets.
• Past index options:
– Packed B-tree
– 3D R-tree
[Figure: incoming streams update the current cells and the current buckets of AMH. Recent buckets are kept in main memory, and old buckets are stored on disk in the past index.]
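The division of labor between AMH and the past index can be sketched with versioned buckets. This is a loose illustration only: it assumes a bucket becomes obsolete whenever it is replaced by a newer version, gives each version a lifespan, and uses a plain Python list as a stand-in for the on-disk packed B-tree or 3D R-tree. `BucketVersion`, `Synopsis`, `set_bucket`, and `versions_at` are hypothetical names.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BucketVersion:
    """One version of a bucket, valid during [t_start, t_end)."""
    region: tuple          # (x_lo, x_hi, y_lo, y_hi)
    count: float
    t_start: float
    t_end: float = math.inf  # open-ended while the version is current


class Synopsis:
    def __init__(self):
        self.current = {}  # bucket name -> current version (the AMH side)
        self.past = []     # obsolete versions (stand-in for the past index)

    def set_bucket(self, name, region, count, now):
        """Install a new version; the previous one is closed and archived."""
        old = self.current.get(name)
        if old is not None:
            self.past.append(replace(old, t_end=now))
        self.current[name] = BucketVersion(region, count, now)

    def versions_at(self, t):
        """Bucket versions that were valid at time t."""
        old = [v for v in self.past if v.t_start <= t < v.t_end]
        live = [v for v in self.current.values() if v.t_start <= t]
        return old + live


if __name__ == "__main__":
    s = Synopsis()
    s.set_bucket("b1", (0, 10, 0, 10), count=5, now=0)
    s.set_bucket("b1", (0, 10, 0, 10), count=8, now=4)
    print([v.count for v in s.versions_at(2)],
          [v.count for v in s.versions_at(4)])
```

A real past index would replace the linear scan in `versions_at` with an index lookup on time (packed B-tree) or on space and time together (3D R-tree).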
![Page 17: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/17.jpg)
Prediction Model
• Prediction based on velocity doesn't work!
– It is not realistic to assume that velocity remains constant between the current time and the query time
– Velocity is highly dynamic
• We suggest using only past and present location information for prediction.
![Page 18: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/18.jpg)
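Prediction Model (cont.)
• Forecast the future using any time-series prediction method; we use an autoregressive (AR) model
[Figure: the prediction model parses an FT query into HT and PT queries against the historical synopsis and forecasts from the returned results. A time-series plot accompanies the diagram.]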
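The forecasting step can be illustrated with the simplest autoregressive model. The slides say only that AR is used, so the order and fitting method here are assumptions: the sketch fits an AR(1) model with an intercept by ordinary least squares to a series of past and present query results (for example, counts for the query region at successive timestamps) and iterates it forward. `fit_ar1` and `forecast` are hypothetical names.

```python
def fit_ar1(series):
    """Least-squares fit of x[t+1] = c + phi * x[t]; returns (c, phi)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # Constant history: predict the mean, with no dependence on x[t].
        return mean_y, 0.0
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    phi = sxy / sxx
    return mean_y - phi * mean_x, phi


def forecast(series, steps):
    """Predict the next `steps` values by iterating the fitted model."""
    c, phi = fit_ar1(series)
    value = series[-1]
    predictions = []
    for _ in range(steps):
        value = c + phi * value
        predictions.append(value)
    return predictions


if __name__ == "__main__":
    # History generated by x[t+1] = 2 + 0.5 * x[t], starting from 0.
    history = [0.0, 2.0, 3.0, 3.5, 3.75, 3.875]
    print(fit_ar1(history), forecast(history, 2))
```

The series needs at least two observations. A higher-order AR model would use several past values per prediction, at the cost of a larger fitting problem.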
![Page 19: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/19.jpg)
Outline
• Introduction
• Related work
• Problem and proposed methods
– Adaptive multi-dimensional histogram
– Historical synopsis
– Prediction model
• Experiment
• Conclusion
![Page 20: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/20.jpg)
Experiment settings
• Datasets
– 2.5M updates for each dataset
– spatial: 50K mobile objects from 2 spatial datasets
– road: from a spatio-temporal generator (described in [Brinkhoff 2002])
[Figure: the road network and the data distribution, with panels labeled initial, median, and final.]
![Page 21: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/21.jpg)
Robustness with time
• Query: qlength = 6% of the data space; 25K queries uniformly distributed over space and time
[Figure: error rate vs. number of location updates (5k to 2.5M), one panel each for the spatial and road datasets.]
![Page 22: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/22.jpg)
Comparison with conventional histogram
• Minskew (a static spatial histogram) is rebuilt every 50k location updates
• tp (time proportion) is the ratio of the cost of AMH to that of Minskew
• The reorganization operations of AMH are uniformly distributed among the 50k location updates
[Figure: error rate vs. time proportion (0.001 to 1) for Minskew and AMH, one panel each for the spatial and road datasets.]
![Page 23: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/23.jpg)
The effect of update intensity
• The B-tree performs better at high update rates.
• The R-tree provides much faster query response.
• In general, when the query/update ratio is large (>30%), the R-tree performs better.
[Figure: CPU time (msec) by query type (PT, HT, FT) for the 3D R-tree and the B-tree, and error rate vs. update rate (1k to 100k) for the spatial and road datasets.]
![Page 24: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/24.jpg)
Conclusion
• We present a comprehensive approach for processing queries that refer to any time in history.
• The proposed architecture maintains
– an incremental multi-dimensional histogram;
– a past index structure for storing the outdated buckets.
• Future queries are answered by a stochastic method that uses the recent history to predict the future.
![Page 25: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/25.jpg)
Q+A
![Page 26: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/26.jpg)
Summary
• AMH: the goal is min(WVS); (1) information update; (2) reorganization happens when the CPU is idle
• Past index / historical synopsis: (1) recent buckets in memory; (2) old buckets dumped to disk
• Prediction model: forecast based on the present and the past
![Page 27: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/27.jpg)
Related work
• Static multi-dimensional histograms
• Query-adaptive multi-dimensional histograms
• Other multi-dimensional approximation methods
• Spatio-temporal prediction methods
• Spatio-temporal aggregation methods
![Page 28: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/28.jpg)
Evaluation over different query types
[Figure: error rate vs. qL (2% to 10%), one panel each for the spatial and road datasets.]
![Page 29: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/29.jpg)
Motivation (cont.)
• Spatio-temporal database (STDB) research:
– Historical retrieval
– Future prediction
![Page 30: Approximate querying about the Past, the Present, and the Future in Spatio-Temporal Databases](https://reader035.vdocuments.site/reader035/viewer/2022062422/56813d11550346895da6cd74/html5/thumbnails/30.jpg)
Bucket reorganization - Split
[Figure: the BPT and bucket list before and after splitting bucket b*. Before the split, the buckets are b1, b2, b*, and b5. The split adds nodes n4 and n5 and sub-buckets labeled b*1 to b*4.]