On Discovering Moving Clusters in Spatio-temporal Data Panos Kalnis National University of Singapore Nikos Mamoulis University of Hong Kong Spiridon Bakiras Hong Kong University of Science and Technology

Post on 20-Dec-2015


Page 1:

On Discovering Moving Clusters in Spatio-temporal Data

Panos Kalnis, National University of Singapore

Nikos Mamoulis, University of Hong Kong

Spiridon Bakiras, Hong Kong University of Science and Technology

Page 2:

What is a Moving Cluster?

Dense clusters of objects that move similarly for a long time period
Not necessarily the same objects during the lifetime of the cluster

Examples: migrating animals, convoys of cars, military applications

Solutions: Efficient exact and approximate algorithms

Page 3:

Problem Formulation

Moving cluster criterion for clusters $c_i$, $c_{i+1}$ in consecutive timeslices:

$$\frac{|c_i \cap c_{i+1}|}{|c_i \cup c_{i+1}|} \ge \theta$$

Example: Moving cluster $c_1 c_2 c_3$ for $\theta = 0.5$

$$\frac{|c_1 \cap c_2|}{|c_1 \cup c_2|} = \frac{3}{6}, \qquad \frac{|c_2 \cap c_3|}{|c_2 \cup c_3|} = \frac{4}{5}$$
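The example can be checked with a few lines of Python. This is a minimal sketch that assumes the criterion is the ratio of shared objects to all objects of two consecutive clusters, compared against θ. The object ids and the names `overlap_ratio` and `is_moving_cluster` are hypothetical, chosen only to reproduce the ratios 3/6 and 4/5.

```python
from fractions import Fraction


def overlap_ratio(a, b):
    """Shared objects divided by all objects of two snapshot clusters."""
    union = a | b
    return Fraction(len(a & b), len(union)) if union else Fraction(0)


def is_moving_cluster(snapshots, theta):
    """True if every consecutive pair of snapshot clusters passes the theta test."""
    return all(
        overlap_ratio(a, b) >= theta
        for a, b in zip(snapshots, snapshots[1:])
    )


# Hypothetical object ids that give the ratios 3/6 and 4/5.
c1 = {1, 2, 3, 4}
c2 = {2, 3, 4, 5, 6}
c3 = {3, 4, 5, 6}

print(overlap_ratio(c1, c2), overlap_ratio(c2, c3))
print(is_moving_cluster([c1, c2, c3], Fraction(1, 2)))
```

Exact fractions are used so that a ratio equal to θ is not lost to floating-point rounding.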

Page 4:

Related Work (Static)

Partition-based clustering (k-medoids)
Hierarchical clustering (BIRCH, CURE)
Density-based clustering (DBSCAN)

[Figure: DBSCAN illustration with radius ε and MinPts=3]

Page 5:

Related Work (Moving Objects)

Grouping trajectories [Vlachos et al., ICDE 02]
  Trajectory cluster: constant set of objects throughout its lifetime
  Only similar movement; no space proximity

Dense areas over time [Hadjieleftheriou et al., SSTD 03]
  Static dense regions
  No common objects between regions in sequence

Incremental DBSCAN/OPTICS [Ester et al., VLDB 98]
  Only a small percentage of objects move

Maintaining Data Bubbles [Nassar et al., SIGMOD 04]
  Redistributes updated objects in existing bubbles

Page 6:

MC1: The Straightforward Approach

G: set of moving clusters
Apply clustering to the next timeslice Si
Expand moving clusters in G
Add new moving clusters to G
Report ending clusters
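The loop on this slide could be sketched as follows. This is an illustration under assumptions, not the authors' implementation: each timeslice is taken as already clustered (a list of sets of object ids), the matching test is the overlap ratio against θ, and every snapshot cluster that extends nothing starts a new moving cluster. The names `jaccard` and `mc1` are hypothetical.

```python
def jaccard(a, b):
    """Shared objects divided by all objects of two clusters."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mc1(timeslice_clusters, theta):
    """Return moving clusters as lists of snapshot clusters.

    timeslice_clusters: one list of frozensets per timeslice
    (the per-timeslice clustering step is assumed to be done already).
    """
    active = []    # G: moving clusters that may still be extended
    finished = []  # moving clusters that have ended
    for clusters in timeslice_clusters:
        next_active = []
        extended = set()  # indices of snapshot clusters that continued some g
        for g in active:
            grew = False
            # Compare the last snapshot of g with every new cluster.
            for idx, c in enumerate(clusters):
                if jaccard(g[-1], c) >= theta:
                    next_active.append(g + [c])
                    extended.add(idx)
                    grew = True
            if not grew:
                finished.append(g)  # report ending cluster
        # Assumption: unmatched snapshot clusters start new moving clusters.
        for idx, c in enumerate(clusters):
            if idx not in extended:
                next_active.append([c])
        active = next_active
    return finished + active


A = frozenset({1, 2, 3, 4})
B = frozenset({2, 3, 4, 5})
C = frozenset({9, 10})
result = mc1([[A], [B], [C]], theta=0.5)
```

The inner double loop compares every moving cluster in G with every cluster of the new timeslice, which is the cost the later variants try to reduce.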

Page 7:

Hash-based DBSCAN

Memory: 10M objects with 1GB RAM

[Figure and memory-cost expression (involving O, |Si| and g) garbled in extraction]

Page 8:

MC1 is inefficient!

1. Checks all possible combinations of clusters in consecutive timeslices

2. Performs clustering for every timeslice

Page 9:

MC2: Minimizing Redundant Checks

Clustering in every timeslice
Select a random object in c1
Search for the object in S2
Repeat for the remaining objects
Max: (1-θ)|ci| objects
c1c2 is a moving cluster
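The probing idea can be sketched as below, assuming the clusters of the next timeslice are disjoint and available through a lookup from object id to cluster. The function `find_successor`, the `cluster_of` dictionary and the `rng` parameter are hypothetical names. A successor must contain at least θ|c| objects of c, so at most about (1-θ)|c| objects of c can lie outside it. The sketch probes one object more than that bound so the guarantee also holds when the bound is zero or not an integer; that extra probe is an assumption, not something stated on the slide.

```python
import math
import random


def jaccard(a, b):
    """Shared objects divided by all objects of two clusters."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_successor(c, cluster_of, theta, rng=random):
    """Find a cluster of the next timeslice that continues c, or None.

    c: frozenset of object ids.
    cluster_of: dict mapping object id -> frozenset (its cluster in the
    next timeslice); ids that are noise or have disappeared are absent.
    """
    # Objects of c that may fall outside a valid successor.
    # The small epsilon guards against float error in theta * len(c).
    max_misses = len(c) - math.ceil(theta * len(c) - 1e-9)
    probes = rng.sample(sorted(c), min(len(c), max_misses + 1))
    checked = set()
    for obj in probes:
        candidate = cluster_of.get(obj)
        if candidate is None or candidate in checked:
            continue  # noise, vanished, or a cluster already tested
        checked.add(candidate)
        if jaccard(c, candidate) >= theta:
            return candidate
    return None


c1 = frozenset({1, 2, 3, 4})
c2 = frozenset({1, 2, 3, 4, 5})
lookup = {obj: c2 for obj in c2}
successor = find_successor(c1, lookup, theta=0.5, rng=random.Random(0))
```

Only the clusters reached through the probed objects are compared, instead of every cluster of the next timeslice.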

Page 10:

Ambiguity Cases: θ<0.5

[Figure: ambiguity example; label 1/3]

{c0c1, c2}

{c0c2, c1}
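A short counting argument suggests why the ambiguity is tied to small θ. It assumes, as a simplification not stated on the slide, that clusters within one timeslice are disjoint and that the moving-cluster test is the overlap ratio compared against θ. If both $c_1$ and $c_2$ can continue $c_0$, then

$$|c_0 \cap c_j| \;\ge\; \theta\,|c_0 \cup c_j| \;\ge\; \theta\,|c_0|, \qquad j \in \{1, 2\}$$

$$|c_0| \;\ge\; |c_0 \cap c_1| + |c_0 \cap c_2| \;\ge\; 2\theta\,|c_0| \quad\Longrightarrow\quad \theta \le \tfrac{1}{2}$$

Under these assumptions, two competing successors can only arise when θ is at most 1/2; for larger θ, at most one cluster of the next timeslice can continue $c_0$.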

Page 11:

MC3: Approximate Moving Clusters

Intuition: Many clusters will remain the same even if objects move
Avoid performing clustering in every timeslice
For an object o
  If o belongs to cluster c in timeslice Si
  Assume that o also belongs to c in the next timeslice (notice: objects may have moved)

Page 12:

Refine clusters

Hash new clusters in a grid
Legal cluster:
  Does not meet/intersect with other clusters
  It is connected (cells meet)
Objects in legal clusters are not considered further
For the rest of the objects, perform clustering
Possible inaccuracies!
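The legality test could look like the following sketch. Several details are assumptions: points are 2D, the grid cell size is passed in as `cell`, two cells "meet" when they are equal or adjacent (including diagonally), and all function names are hypothetical.

```python
from collections import deque
from itertools import product


def cells_of(points, cell):
    """Grid cells occupied by a cluster's points."""
    return {(int(x // cell), int(y // cell)) for x, y in points}


def neighbourhood(c):
    """The cell itself plus its 8 surrounding cells."""
    cx, cy = c
    return {(cx + dx, cy + dy) for dx, dy in product((-1, 0, 1), repeat=2)}


def is_connected(cells):
    """Breadth-first search over occupied cells that meet each other."""
    if not cells:
        return False
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nb in neighbourhood(current) & cells:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == cells


def legal_clusters(clusters, cell):
    """Ids of clusters that are connected and touch no other cluster.

    clusters: dict mapping cluster id -> list of (x, y) points at their
    new positions, with memberships carried over from the last timeslice.
    """
    grids = {cid: cells_of(pts, cell) for cid, pts in clusters.items()}
    legal = set()
    for cid, cells in grids.items():
        halo = set().union(*(neighbourhood(c) for c in cells))
        touches = any(halo & other for oid, other in grids.items() if oid != cid)
        if is_connected(cells) and not touches:
            legal.add(cid)
    return legal


example = {
    "a": [(0.5, 0.5), (1.5, 0.5)],      # connected, isolated
    "b": [(10.5, 10.5)],                # meets "c"
    "c": [(11.5, 10.5)],                # meets "b"
    "d": [(20.5, 20.5), (25.5, 20.5)],  # split into two distant parts
}
legal = legal_clusters(example, cell=1.0)
```

Only the objects of clusters outside `legal` would then be passed to the clustering step.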

Page 13:

Minimize Error

Perform exact clustering to absorb (may not eliminate) the accumulated error

Period for exact clustering: Grows linearly, drops exponentially

Exact clustering: If more than α|G| clusters have been added/removed
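One possible reading of the period rule is sketched below. The slide only says that the period grows linearly and drops exponentially. Tying the drop to the α|G| test, growing by one timeslice and halving are all assumptions, and `next_period` is a hypothetical name.

```python
def next_period(period, changed, total, alpha, step=1, factor=2):
    """Choose how many timeslices to wait before the next exact clustering.

    period:  current number of timeslices between exact clusterings
    changed: clusters added or removed by the last exact clustering
    total:   |G|, the current number of moving clusters
    alpha:   tolerated fraction of changed clusters
    """
    if changed > alpha * total:
        # Too much accumulated error: shrink the period multiplicatively.
        return max(1, period // factor)
    # Error is under control: lengthen the period additively.
    return period + step


period = 4
period = next_period(period, changed=1, total=20, alpha=0.1)  # grows to 5
period = next_period(period, changed=3, total=20, alpha=0.1)  # drops to 2
```

With this rule, a small α makes the period shrink after even a few changes, so exact clustering runs more often.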

Page 14:

Experimental Evaluation

10K-50K objects per timeslice
50-100 timeslices, up to 5M objects
Linux, C++, 1.3GHz CPU, 1.2GB RAM
Generator: Clusters move/rotate, objects appear/disappear

$$F = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
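The quality measure combines precision and recall into a single F score. A minimal sketch follows, assuming the standard F-measure and, purely for illustration, that an approximate moving cluster counts as correct only when it is identical to one found by the exact method. The slides do not say how matches are counted, and the names `f_measure` and `score` are hypothetical.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0 if both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def score(reported, exact):
    """Precision, recall and F for two sets of moving-cluster identifiers."""
    hits = len(reported & exact)
    precision = hits / len(reported) if reported else 0.0
    recall = hits / len(exact) if exact else 0.0
    return precision, recall, f_measure(precision, recall)


# Hypothetical identifiers: 4 clusters reported, 2 of them are exact ones.
precision, recall, f = score({"a", "b", "c", "d"}, {"a", "b"})
```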

Page 15:

Varying data size (10K-50K per timeslice)

Avg: 87%

θ=0.9, α=0.1
Larger dataset: larger clusters, more interactions

Page 16:

Varying number of clusters (100-800 per timeslice)

5M objects, θ=0.9, α=0.1
Many clusters: Reaches error threshold fast

[Figure labels: 96%, 87%, 73%]

Page 17:

Varying α

5M objects, θ=0.9, 800 clusters
α small: may not recover!

Page 18:

Varying α for different agilities

Low agility: Fewer errors faster

Page 19:

MC3 for varying θ

5M objects, α=0.1, 800 clusters
θ large: incorrect clusters are pruned for not satisfying the θ criterion

Page 20:

Conclusions

Moving clusters
  Objects may move/change
  Exact and approximate solutions

Future work
  Automatic setting of parameter α
  Better error estimation
  Constraints (e.g., moving cluster must span at least k timeslices)

Page 21:

Questions?