Clustering Trajectories of Moving Objects in an Uncertain World

Nikos Pelekis 1, Ioannis Kopanakis 2, Evangelos E. Kotsifakos 1, Elias Frentzos 1, Yannis Theodoridis 1
1 Dept. of Informatics, Univ. of Piraeus, Greece
2 Tech. Educational Institute of Crete, Greece
IEEE International Conference on Data Mining (ICDM 2009), Miami, FL, USA, 6-9 December, 2009



Page 1: Clustering Trajectories of Moving Objects in an Uncertain World

Clustering Trajectories of Moving Objects in an Uncertain World

1 Dept. of Informatics, Univ. of Piraeus, Greece

2 Tech. Educational Institute of Crete, Greece

Nikos Pelekis 1, Ioannis Kopanakis 2, Evangelos E. Kotsifakos 1, Elias Frentzos 1, Yannis Theodoridis 1

IEEE International Conference on Data Mining (ICDM 2009), Miami, FL, USA, 6-9 December, 2009

Page 2: Clustering Trajectories of Moving Objects in an Uncertain World

Pelekis et al. "Clustering Trajectories of Moving Objects in an Uncertain World" 2

Outline

Related work
Motivation
Our contribution
  From Trajectories to Intuitionistic Fuzzy Sets
  A similarity metric for Uncertain Trajectories (Un-Tra)
  Cen-Tra: The Centroid Trajectory of a bunch of trajectories
  TR-I-FCM: A novel clustering algorithm for Un-Tra
Experimental study
Conclusions & future work

Page 3: Clustering Trajectories of Moving Objects in an Uncertain World

Related Work on Mobility Data Mining

Trajectory clustering

Page 4: Clustering Trajectories of Moving Objects in an Uncertain World


Trajectory Clustering

Questions:
  Which distance between trajectories?
  Which kind of clustering?
  What is a cluster ‘mean’ or ‘centroid’? A representative trajectory?

Page 5: Clustering Trajectories of Moving Objects in an Uncertain World


Which distance?

Average Euclidean distance

$$D(\tau_1, \tau_2)|_T = \frac{\int_T d(\tau_1(t), \tau_2(t)) \, dt}{|T|}$$

where $d(\tau_1(t), \tau_2(t))$ is the distance between moving objects $\tau_1$ and $\tau_2$ at time t

  “Synchronized” behaviour distance: similar objects = almost always in the same place at the same time
  Computed on the whole trajectory
  Computational aspects:
    Cost = O(|τ1| + |τ2|) (|τ| = number of points in τ)
    It is a metric => efficient indexing methods allowed, e.g. [Frentzos et al. 2007]

Timeseries-based approaches: LCSS, DTW, ERP, EDR

Trajectory-oriented approach: (time-relaxed) route similarity vs. (time-aware) trajectory similarity and variations (speed-pattern based similarity; directional similarity; …) [Pelekis et al. 2007]
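The time-averaged distance on this slide can be approximated numerically. The following is a minimal sketch, assuming both trajectories are sampled at the same sorted timestamps; the function name `average_euclidean_distance` and the use of the trapezoidal rule are illustrative choices, not something the slides specify.

```python
import math


def average_euclidean_distance(times, traj1, traj2):
    """Approximate (1/|T|) * integral over T of d(tau1(t), tau2(t)) dt.

    times: sorted timestamps shared by both trajectories (assumption).
    traj1, traj2: lists of (x, y) positions, one per timestamp.
    """
    if not (len(times) == len(traj1) == len(traj2)) or len(times) < 2:
        raise ValueError("need at least two synchronized samples")
    # Point-wise Euclidean distance between the two objects at each sample.
    d = [math.dist(p, q) for p, q in zip(traj1, traj2)]
    # Trapezoidal rule for the integral over the common time interval T.
    integral = sum(
        0.5 * (d[i] + d[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )
    return integral / (times[-1] - times[0])
```

A single pass over the samples is enough, which matches the linear cost stated on the slide. Passing only the samples that fall inside a sub-interval restricts the distance to that interval.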

Page 6: Clustering Trajectories of Moving Objects in an Uncertain World


Which kind of clustering?

K-means
HAC-average
T-OPTICS [Nanni & Pedreschi, 2006]
  Reachability plot (= objects reordering for distance distribution), with a threshold

Page 7: Clustering Trajectories of Moving Objects in an Uncertain World


TRACLUS: A Partition-and-Group Framework

[Lee et al. 2007]
  Discovers similar portions of trajectories (sub-trajectories)
  Two phases: partitioning and grouping

Page 8: Clustering Trajectories of Moving Objects in an Uncertain World

What about usage of Mobility Patterns?

Visual analytics for mobility data

Page 9: Clustering Trajectories of Moving Objects in an Uncertain World


Visual analytics for mobility data

[Andrienko et al. 2007] What is an appropriate way to visualize groups of trajectories?

Page 10: Clustering Trajectories of Moving Objects in an Uncertain World


Summarizing a bunch of trajectories

1) Treat trajectories as sequences of “moves” between “places”

2) For each pair of “places”, compute the number of “moves”

3) Represent “moves” by arrows (with proportional widths)
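The three steps above can be sketched in a few lines. This is a hedged illustration only: it assumes each trajectory has already been reduced to a list of place labels, and the names `count_moves` and `arrow_widths`, the `max_width` parameter, and the choice to ignore a repeated place are assumptions made for the example.

```python
from collections import Counter


def count_moves(place_sequences):
    """Step 2: count the moves between each ordered pair of places."""
    moves = Counter()
    for places in place_sequences:
        for src, dst in zip(places, places[1:]):
            if src != dst:  # assumption: staying in a place is not a move
                moves[(src, dst)] += 1
    return moves


def arrow_widths(moves, max_width=10.0):
    """Step 3: arrow width proportional to the number of moves."""
    if not moves:
        return {}
    top = max(moves.values())
    return {pair: max_width * n / top for pair, n in moves.items()}
```

With this scaling the most frequent move gets the widest arrow, so the major flow stands out and the many small moves stay thin.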

[Figure: flow map showing the major flow, minor variations, and many small moves]

Page 11: Clustering Trajectories of Moving Objects in an Uncertain World

Coming back to our approach

Page 12: Clustering Trajectories of Moving Objects in an Uncertain World


Motivation

Challenge 1: Introduce trajectory fuzziness in spatial clustering techniques
  The application of spatial clustering algorithms (k-means, BIRCH, DBSCAN, STING, …) to Trajectory Databases (TD) is not straightforward
  Fuzzy clustering algorithms (Fuzzy C-Means and its variants) quantify the degree of membership of each data vector to a cluster
  The inherent uncertainty in TD should be taken into account.

Challenge 2: Study the nature of the centroid / mean / representative trajectory in a cluster of trajectories.
  Is it a ‘trajectory’ itself?

Page 13: Clustering Trajectories of Moving Objects in an Uncertain World


Our contribution

I-Un-Tra: An intuitionistic fuzzy vector representation of trajectories, which enables clustering of trajectories by existing (fuzzy or not) clustering algorithms
DUnTra: A distance metric of uncertain trajectories
Cen-Tra: The centroid of a bunch of trajectories, using density and local similarity properties
TR-I-FCM: A novel modification of the FCM algorithm for clustering complex trajectory datasets, exploiting DUnTra and Cen-Tra.

Page 14: Clustering Trajectories of Moving Objects in an Uncertain World


From Fuzzy sets to Intuitionistic fuzzy sets

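Definition 1 (Zadeh, 1965). Let a set E be fixed. A fuzzy set on E is an object of the form

$$A = \{ \langle x, \mu_A(x) \rangle \mid x \in E \}, \quad \text{where } \mu_A : E \to [0,1]$$

Definition 2 (Atanassov, 1986; Atanassov, 1994). An intuitionistic fuzzy set (IFS) A is an object of the form

$$A = \{ \langle x, \mu_A(x), \gamma_A(x) \rangle \mid x \in E \}, \quad \text{where } \mu_A : E \to [0,1] \text{ and } \gamma_A : E \to [0,1]$$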

Page 15: Clustering Trajectories of Moving Objects in an Uncertain World


Hesitancy

For every element $x \in E$:

$$0 \le \mu_A(x) \le 1, \qquad 0 \le \gamma_A(x) \le 1, \qquad 0 \le \mu_A(x) + \gamma_A(x) \le 1$$

The hesitancy of the element x to the set A is

$$\pi_A(x) = 1 - \mu_A(x) - \gamma_A(x)$$
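The constraints and the hesitancy degree above translate directly into a small data type. The sketch below is illustrative; the class name `IFSElement` and its field names are assumptions, not notation from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IFSElement:
    """Membership (mu) and non-membership (gamma) of one element x in A."""

    mu: float
    gamma: float

    def __post_init__(self):
        # Enforce 0 <= mu <= 1, 0 <= gamma <= 1 and mu + gamma <= 1.
        if not (0.0 <= self.mu <= 1.0 and 0.0 <= self.gamma <= 1.0):
            raise ValueError("mu and gamma must lie in [0, 1]")
        if self.mu + self.gamma > 1.0 + 1e-12:
            raise ValueError("mu + gamma must not exceed 1")

    @property
    def pi(self):
        """Hesitancy: whatever is left after membership and non-membership."""
        return 1.0 - self.mu - self.gamma
```

An ordinary fuzzy set is the special case in which the hesitancy is zero for every element.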

Page 16: Clustering Trajectories of Moving Objects in an Uncertain World


Vector representation of trajectories

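Assume a regular grid $G(m \times n)$ consisting of cells $c_{k,l}$, a trajectory

$$T_i = \langle (x_{i,0}, y_{i,0}, t_{i,0}), \ldots, (x_{i,n_i}, y_{i,n_i}, t_{i,n_i}) \rangle$$

and a target dimension $p \ll n_i$.

The “approximate trajectory”

$$\bar{T}_i = \langle r_{i,1}, \ldots, r_{i,p} \rangle$$

consists of p regions (i.e. sets of cells) crossed by $T_i$ during period $p_j$, with

$$p_j = \left[ \frac{ls \cdot (j-1)}{p}, \; \frac{ls \cdot j}{p} \right]$$

where ls is presumably the lifespan of the trajectory.

The “Uncertain Trajectory” is the ε-buffer of $\bar{T}_i$:

$$UnTra(T_i) = \langle ur_{i,1}, \ldots, ur_{i,p} \rangle$$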
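A rough sketch of this vectorization follows. It rests on several assumptions that the slides do not state: coordinates are normalized to [0, 1), the grid has m cells along x and n along y, only the sampled points are mapped to cells (no interpolation between samples), the periods are equal slices of the lifespan, and ε is expressed in whole cells. The function names are hypothetical.

```python
def approximate_trajectory(samples, p, m, n):
    """Map a trajectory to p regions, each a set of grid cells (k, l).

    samples: list of (x, y, t) sorted by t, with x and y in [0, 1).
    """
    t0, t_end = samples[0][2], samples[-1][2]
    ls = t_end - t0  # lifespan of the trajectory
    if ls <= 0:
        raise ValueError("trajectory must span a positive time interval")
    regions = [set() for _ in range(p)]
    for x, y, t in samples:
        # Index of the period containing t; the final sample is assigned
        # to the last period.
        j = min(int((t - t0) / ls * p), p - 1)
        k = min(int(x * m), m - 1)
        l = min(int(y * n), n - 1)
        regions[j].add((k, l))
    return regions


def epsilon_buffer(region, eps, m, n):
    """Grow a region by eps cells in every direction, clipped to the grid."""
    return {
        (k + dk, l + dl)
        for (k, l) in region
        for dk in range(-eps, eps + 1)
        for dl in range(-eps, eps + 1)
        if 0 <= k + dk < m and 0 <= l + dl < n
    }
```

Applying `epsilon_buffer` to every region of the approximate trajectory gives one possible reading of the uncertain trajectory as a sequence of buffered regions.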

Page 17: Clustering Trajectories of Moving Objects in an Uncertain World


Intuitionistic Uncertain Trajectories

membership = inside cell with 100% probability (i.e. thick portions)
non-membership = outside cell with 100% probability (i.e. dotted portions)
hesitancy = ignorance whether inside or outside the cell (i.e. solid thin portions)

[Figure: a cell $c_{k,l}$ crossed by a trajectory, with the uncertainty ε marked]

Page 18: Clustering Trajectories of Moving Objects in an Uncertain World


Intuitionistic Uncertain Trajectories

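$$I\text{-}UnTra(T_i) = \langle (ur_{i,1}, \mu_A(ur_{i,1}), \gamma_A(ur_{i,1})), \ldots, (ur_{i,p}, \mu_A(ur_{i,p}), \gamma_A(ur_{i,p})) \rangle$$

$$\mu_A(ur_{i,j}) = \frac{|r_{i,j}|}{|UnTra(T_i)|}$$

$$\gamma_A(ur_{i,j}) = \frac{|UnTra(T_i) \setminus ur_{i,j}|}{|UnTra(T_i)|}$$

$$\pi_A(ur_{i,j}) = \frac{|ur_{i,j} \setminus r_{i,j}|}{|UnTra(T_i)|}$$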

Page 19: Clustering Trajectories of Moving Objects in an Uncertain World


Proposed similarity metric (1/2)

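The distance between two I-UnTra A and B is:

$$D_{total}(A, B) = \| A - B \|_{IFS}^{UnTra} = \frac{D_{UnTra}(A, B) + D_{IFS}(A, B)}{2}$$

where

$$D_{UnTra}(UnTra(T_i), UnTra(T_j)) = \min \begin{cases} D_{UnTra}(Rst(UnTra(T_i)), Rst(UnTra(T_j))) + \| ur_{i,1} - ur_{j,1} \|_{ext} \\ D_{UnTra}(Rst(UnTra(T_i)), UnTra(T_j)) + \| ur_{i,1} - gap \|_{ext} \\ D_{UnTra}(UnTra(T_i), Rst(UnTra(T_j))) + \| ur_{j,1} - gap \|_{ext} \end{cases}$$

and

$$\| ur_i - ur_j \|_{ext} = 1 - \frac{1}{2} \left( \frac{ext_x(mbr(ur_i)) + ext_x(mbr(ur_j))}{2 \, ext_x(mbr(ur_i \cup ur_j))} + \frac{ext_y(mbr(ur_i)) + ext_y(mbr(ur_j))}{2 \, ext_y(mbr(ur_i \cup ur_j))} \right)$$

[Figure: the extents $ext_x(mbr(ur_i))$, $ext_y(mbr(ur_i))$, $ext_x(mbr(ur_i \cup ur_j))$ and $ext_y(mbr(ur_i \cup ur_j))$ of the minimum bounding rectangles]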
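The recursive definition on this slide has the shape of an edit distance with a gap penalty, so it can be evaluated with dynamic programming. The sketch below is one possible reading, not the authors' implementation. It assumes that each region is a non-empty set of grid cells (k, l), that extents are measured in whole cells, that `gap` is a fixed reference region supplied by the caller, and that an exhausted sequence is aligned entirely against the gap (the usual base case for this family of distances). All function names are hypothetical.

```python
def extent(cells, axis):
    """Side length, in cells, of the bounding rectangle along one axis."""
    values = [c[axis] for c in cells]
    return max(values) - min(values) + 1


def ext_distance(a, b):
    """Extent-based distance between two regions (sets of cells)."""
    union = a | b
    sx = (extent(a, 0) + extent(b, 0)) / (2 * extent(union, 0))
    sy = (extent(a, 1) + extent(b, 1)) / (2 * extent(union, 1))
    return 1.0 - 0.5 * (sx + sy)


def d_untra(seq_a, seq_b, gap):
    """Edit-style distance between two sequences of regions.

    D[i][j] holds the distance between the suffixes seq_a[i:] and
    seq_b[j:], mirroring the Rst (rest-of-sequence) recursion.
    """
    na, nb = len(seq_a), len(seq_b)
    D = [[0.0] * (nb + 1) for _ in range(na + 1)]
    # Assumed base cases: leftover regions are matched against the gap.
    for i in range(na - 1, -1, -1):
        D[i][nb] = D[i + 1][nb] + ext_distance(seq_a[i], gap)
    for j in range(nb - 1, -1, -1):
        D[na][j] = D[na][j + 1] + ext_distance(seq_b[j], gap)
    for i in range(na - 1, -1, -1):
        for j in range(nb - 1, -1, -1):
            D[i][j] = min(
                D[i + 1][j + 1] + ext_distance(seq_a[i], seq_b[j]),
                D[i + 1][j] + ext_distance(seq_a[i], gap),
                D[i][j + 1] + ext_distance(seq_b[j], gap),
            )
    return D[0][0]
```

Filling the table bottom-up avoids the exponential blow-up of the naive recursion; the cost is proportional to the product of the two sequence lengths.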

Page 20: Clustering Trajectories of Moving Objects in an Uncertain World


Proposed similarity metric (2/2)

Assuming a set E and two intuitionistic fuzzy sets on it, A = (MA, ΓA, ΠA) and B = (MB, ΓB, ΠΒ), with the same cardinality n, the similarity measure Z between A and B is given by the following equation:

$$Z(A, B) = \frac{1}{3} \left( z(M_A, M_B) + z(\Gamma_A, \Gamma_B) + z(\Pi_A, \Pi_B) \right)$$

where z(A’,B’) for fuzzy sets A' and B' (e.g. for MA, MB) is defined as:

$$z(A', B') = \begin{cases} \dfrac{\sum_{i=1}^{n} \min(A'(x_i), B'(x_i))}{\sum_{i=1}^{n} \max(A'(x_i), B'(x_i))}, & A' \cup B' \neq \emptyset \\ 1, & A' \cup B' = \emptyset \end{cases}$$

and similarly for ΓA, ΓB and ΠA, ΠB.
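The measure Z can be checked with a short sketch. It assumes each intuitionistic fuzzy set is given as a list of (membership, non-membership) pairs over the same n elements, and it reads the empty-union case as "all degrees are zero"; both the input layout and that reading are assumptions, as are the function names.

```python
def z(a, b):
    """Ratio of summed minima to summed maxima of two degree vectors."""
    denominator = sum(max(x, y) for x, y in zip(a, b))
    if denominator == 0:
        return 1.0  # assumed reading of the empty-union case
    return sum(min(x, y) for x, y in zip(a, b)) / denominator


def similarity_z(set_a, set_b):
    """Average of z over membership, non-membership and hesitancy."""
    if len(set_a) != len(set_b):
        raise ValueError("both sets must have the same cardinality")

    def split(ifs):
        mu = [m for m, _ in ifs]
        gamma = [g for _, g in ifs]
        # Hesitancy, clamped at zero to absorb floating-point noise.
        pi = [max(0.0, 1.0 - m - g) for m, g in ifs]
        return mu, gamma, pi

    parts_a, parts_b = split(set_a), split(set_b)
    return sum(z(pa, pb) for pa, pb in zip(parts_a, parts_b)) / 3.0
```

With the single-element sets from the backup example, A = {x, 0.4, 0.2}, B = {x, 0.5, 0.3} and C = {x, 0.5, 0.2}, this gives a higher value for the pair (A, C) than for the pair (A, B).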

Page 21: Clustering Trajectories of Moving Objects in an Uncertain World


The Centroid Trajectory

The idea (similarity-density-based approach):
  adopt some local similarity function to identify common sub-trajectories (concurrent existence in space-time),
  follow a region growing approach according to density
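The slides do not spell out CenTra step by step, so the following is deliberately not that algorithm. It is a much simpler per-period density filter, shown only to make the notion of density concrete: for each period it keeps the cells crossed by at least a fraction δ of the trajectories. The local similarity function and the region growing are left out. The input layout (one set of cells per period and trajectory) and the name `dense_cells_per_period` are assumptions.

```python
from collections import Counter


def dense_cells_per_period(trajectories, delta):
    """Keep, per period, the cells crossed by enough trajectories.

    trajectories: list of region sequences; trajectories[t][j] is the set
    of cells that trajectory t crosses in period j (same p for all).
    delta: density threshold as a fraction of the number of trajectories.
    """
    p = len(trajectories[0])
    threshold = delta * len(trajectories)
    dense = []
    for j in range(p):
        # Each trajectory contributes at most once per cell and period.
        counts = Counter(cell for traj in trajectories for cell in traj[j])
        dense.append({cell for cell, c in counts.items() if c >= threshold})
    return dense
```

A period in which no cell reaches the threshold yields an empty set, which is where a region growing step would have to decide how to continue.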

Page 22: Clustering Trajectories of Moving Objects in an Uncertain World


Algorithm CenTra: An example

[Figure: trajectories T1, T2, T3]

Page 23: Clustering Trajectories of Moving Objects in an Uncertain World

The Centroid Trajectory

[Figure: trajectories T1, T2, T3]

Page 24: Clustering Trajectories of Moving Objects in an Uncertain World


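Fuzzy C-Means algorithm

The FCM objective function:

$$J_m(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, d_{ik}^{2}$$

Given the constraints

$$\sum_{i=1}^{c} u_{ik} = 1, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N$$

minimizing $J_m$ requires:

$$u_{ik} = \begin{cases} \left[ \sum_{j=1}^{c} \left( \dfrac{d_{ik}}{d_{jk}} \right)^{\frac{2}{m-1}} \right]^{-1}, & I_k = \emptyset \\ 0 \ \ \forall i \notin I_k, \quad \sum_{i \in I_k} u_{ik} = 1, & I_k \neq \emptyset \end{cases} \qquad 1 \le i \le c, \ 1 \le k \le N$$

where $I_k = \{ i \mid 1 \le i \le c, \ d_{ik} = 0 \}$, and

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c$$

1. Determine c (1 < c < N), and initialize V(0), j=1,
2. Calculate the membership matrix U(j),
3. Update the centroids’ matrix V(j),
4. If |U(j+1)-U(j)|>ε then j=j+1 and go to Step 2.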
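The four steps above can be sketched for plain numeric vectors as follows. This is a minimal illustration of standard Fuzzy C-Means, not code from the authors. The Euclidean distance, the max-absolute-change stopping test, the `max_iter` cap, the equal sharing of membership among coinciding centroids, and the `init` and `seed` parameters are assumptions made to keep the example short and reproducible.

```python
import math
import random


def membership_row(x, centroids, m):
    """Membership of one point in every cluster."""
    d = [math.dist(x, v) for v in centroids]
    zeros = [i for i, di in enumerate(d) if di == 0.0]
    if zeros:
        # The point coincides with one or more centroids: share the
        # membership equally among them (one valid choice).
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    e = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** e for dj in d) for di in d]


def fcm(points, c, m=2.0, eps=1e-6, max_iter=100, init=None, seed=0):
    """Return (U, V): U[k][i] is the membership of point k in cluster i."""
    rng = random.Random(seed)
    # Step 1: initial centroids V(0), given or sampled from the data.
    centroids = [list(v) for v in (init or rng.sample(points, c))]
    dims = len(points[0])
    previous = None
    for _ in range(max_iter):
        # Step 2: membership matrix.
        U = [membership_row(x, centroids, m) for x in points]
        # Step 3: centroids as weighted means with weights u^m.
        for i in range(c):
            w = [row[i] ** m for row in U]
            total = sum(w)
            centroids[i] = [
                sum(wk * x[dim] for wk, x in zip(w, points)) / total
                for dim in range(dims)
            ]
        # Step 4: stop once the memberships no longer change.
        if previous is not None:
            change = max(
                abs(a - b) for ra, rb in zip(U, previous) for a, b in zip(ra, rb)
            )
            if change <= eps:
                break
        previous = U
    return U, centroids
```

The result depends on the initial centroids, which is the initialization effect that the future-work slide refers to.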

Page 25: Clustering Trajectories of Moving Objects in an Uncertain World


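CenTR-I-FCM algorithm

Ignore the update-centroid step and instead use CenTra

The FCM objective function:

$$J_m^{CenTR\text{-}I\text{-}FCM}(U, V) = \sum_{i=1}^{c} \sum_{k=1}^{N} u_{ik}^{m} \, \| x_k - v_i \|_{IFS}^{UnTra}$$

Given the constraints

$$\sum_{i=1}^{c} u_{ik} = 1, \qquad 0 < \sum_{k=1}^{N} u_{ik} < N$$

minimizing it requires:

$$u_{ik} = \begin{cases} \left[ \sum_{j=1}^{c} \left( \dfrac{\| x_k - v_i \|_{IFS}^{UnTra}}{\| x_k - v_j \|_{IFS}^{UnTra}} \right)^{\frac{1}{m-1}} \right]^{-1}, & I_k = \emptyset \\ 0 \ \ \forall i \notin I_k, \quad \sum_{i \in I_k} u_{ik} = 1, & I_k \neq \emptyset \end{cases} \qquad 1 \le i \le c, \ 1 \le k \le N$$

where $I_k = \{ i \mid 1 \le i \le c, \ \| x_k - v_i \|_{IFS}^{UnTra} = 0 \}$, and the FCM centroid update

$$v_i = \frac{\sum_{k=1}^{N} u_{ik}^{m} \, x_k}{\sum_{k=1}^{N} u_{ik}^{m}}, \qquad 1 \le i \le c$$

is the step that CenTra replaces.

1. V(0) = c random I-UnTra; j=1;
2. repeat
3.   Calculate membership matrix U(j)
4.   Update the centroids’ matrix V(j) using CenTra;
5.   Compute membership and non-membership degrees of V(j)
6. until ||U(j+1)-U(j)||F ≤ ε; j=j+1;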
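The membership update of this variant differs from plain FCM in two ways: the distance is an arbitrary function rather than a squared Euclidean norm, and the exponent is 1/(m-1). A hedged sketch of only that step is given below. The `distance` argument stands in for the combined trajectory distance, and the equal sharing of membership among zero-distance centroids is an assumption; the centroid step is not shown because the slides delegate it to CenTra.

```python
def memberships(x, centroids, distance, m=2.0):
    """Membership of item x in each cluster under a custom distance."""
    d = [distance(x, v) for v in centroids]
    zeros = [i for i, di in enumerate(d) if di == 0]
    if zeros:
        # x coincides with one or more centroids: share membership equally.
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(d))]
    e = 1.0 / (m - 1.0)
    return [1.0 / sum((di / dj) ** e for dj in d) for di in d]
```

For example, with scalars, an absolute-difference distance and m = 2, an item at 1 between centroids at 0 and 4 gets memberships 0.75 and 0.25.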

Page 26: Clustering Trajectories of Moving Objects in an Uncertain World


Experiments (1/2)

Dataset: ‘Athens trucks’ MOD (www.rtreeportal.org)
  50 trucks, 1100 trajectories, 112,300 position records

Page 27: Clustering Trajectories of Moving Objects in an Uncertain World


Experiments (2/2)

Use CommonGIS [Andrienko et al., 2007] to identify real clusters

[Figure panels: “Round trips” clusters; “Linear” clusters]

Page 28: Clustering Trajectories of Moving Objects in an Uncertain World


Results (Clustering accuracy scaling cell size, ε)

Fix density threshold to δ=2% of the total number of trajectories

[Figure: success (0 to 100) by cell size (0.80%, 1.00%, 1.33%, 2.00%, 4.00%, 6.67%) and ε (0, 1, 2, 3)]

Page 29: Clustering Trajectories of Moving Objects in an Uncertain World


Results (Clustering accuracy scaling density threshold, δ)

Fix uncertainty to ε=1

[Figure: success (0 to 100) by cell size (0.80%, 1.00%, 1.33%, 2.00%, 4.00%, 6.67%) and δ (0.02, 0.04, 0.06)]

Page 30: Clustering Trajectories of Moving Objects in an Uncertain World


Results (scaling the number of clusters)

[Figure: success (0 to 100) vs. # clusters (2, 3, 4); series: CenTR-I-FCM, TR-FCM]

Page 31: Clustering Trajectories of Moving Objects in an Uncertain World


Results (scaling the dataset cardinality)

[Figure: execution time in sec (0 to 20) vs. # trajectories (0 to 1200); series: 2 clusters, 3 clusters, 4 clusters, 5 clusters]

Page 32: Clustering Trajectories of Moving Objects in an Uncertain World


Results (Quality of CenTra)

Representative Trajectories vs. Centroid Trajectories

[Figure panels: cell size=1.3%, ε=0, δ=0.09; cell size=1.3%, ε=0, δ=0.09; cell size=2.8%, ε=0, δ=0.02]

Page 33: Clustering Trajectories of Moving Objects in an Uncertain World


Conclusions

We proposed a three-step approach for clustering trajectories of moving objects, motivated by the observation that clustering and representation issues in TD are inherently subject to uncertainty.
  1st step: an intuitionistic fuzzy vector representation of trajectories plus a distance metric consisting of a metric for sequences of regions and a metric for intuitionistic fuzzy sets
  2nd step: Algorithm CenTra, a novel technique for discovering the centroid of a bundle of trajectories
  3rd step: Algorithm CenTR-I-FCM, for clustering trajectories under uncertainty

Page 34: Clustering Trajectories of Moving Objects in an Uncertain World


Future Work (@ ICDM time)

Exploit the metric properties of the proposed distance function by using a distance-based index structure (for efficiency purposes);

Perform extensive experimental evaluation using large trajectory datasets;

Devise a clever sampling technique for multi-dimensional data so as to diminish the effect of initialization in the algorithm.

The last two have already been done in:
  N. Pelekis, I. Kopanakis, E. Kotsifakos, E. Frentzos and Y. Theodoridis. “Clustering Uncertain Trajectories”, Knowledge and Information Systems (KAIS), to appear.
  N. Pelekis, I. Kopanakis, C. Panagiotakis and Y. Theodoridis. “Unsupervised Trajectory Sampling”, in ECML PKDD 2010, Barcelona, Spain, 2010, to appear.

Page 35: Clustering Trajectories of Moving Objects in an Uncertain World


Acknowledgements

Research partially supported by the FP7 ICT/FET Project MODAP (Mobility, Data Mining, and Privacy) funded by the European Union. URL: www.modap.org

a continuation of the FP6-14915 IST/FET Project GeoPKDD (Geographic Privacy-aware Knowledge Discovery and Delivery) funded by the European Union. URL: www.geopkdd.eu

Some slides are from: Fosca Giannotti, Dino Pedreschi, and Yannis Theodoridis, “Geographic Privacy-aware Knowledge Discovery and Delivery”, EDBT Tutorial, 2009.

Page 36: Clustering Trajectories of Moving Objects in an Uncertain World

Back up slides

Page 37: Clustering Trajectories of Moving Objects in an Uncertain World


Examples of mobility patterns exploitation

Trajectory Density-based queries
Find hot-spots (popular places) [Giannotti et al. 2007]
Find T-Patterns [Giannotti et al. 2007]
Find hot motion paths [Sacharidis et al. 2008]
Find typical trajectories [Lee et al. 2007]
Identify flocks & leaders [Benkert et al. 2008]

[Figure: axes X, Y, T, with labels δt and ε]

Page 38: Clustering Trajectories of Moving Objects in an Uncertain World


Which kind of clustering?

General requirements:
  Non-spherical clusters should be allowed
    E.g.: A traffic jam along a road = “snake-shaped” cluster
  Tolerance to noise
  Low computational cost
  Applicability to complex, possibly non-vectorial data

Page 39: Clustering Trajectories of Moving Objects in an Uncertain World


Temporal focusing

Different time intervals can show different behaviours
  E.g.: objects that are close to each other within a time interval can be much more distant in other periods of time
The time interval becomes a parameter
  E.g.: rush hours vs. low traffic times
Already supported by the distance measure
  Just compute $D(\tau_1, \tau_2)|_{T'}$ on a time interval $T' \subseteq T$
Problem: significant T’ are not always known a priori
  An automated mechanism is needed to find them

Page 40: Clustering Trajectories of Moving Objects in an Uncertain World


TRACLUS – representative trajectory

The representative trajectory of the cluster:
  Compute the average direction vector and rotate the axes temporarily.
  Sort the starting and ending points by the coordinate of the rotated axis.
  While scanning the starting and ending points in the sorted order, count the number of line segments and compute the average coordinate of those line segments.
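The sweep described on this slide can be sketched as follows. This is a simplified illustration, not the published TRACLUS code. It assumes the cluster is given as directed 2-D line segments, uses a hypothetical `min_lns` parameter for the minimum number of segments that must be crossed at a sweep position, and omits any smoothing between consecutive positions.

```python
import math


def representative_trajectory(segments, min_lns=2):
    """segments: list of ((x1, y1), (x2, y2)) directed line segments."""
    # Average direction vector of the cluster and its angle.
    vx = sum(e[0] - s[0] for s, e in segments)
    vy = sum(e[1] - s[1] for s, e in segments)
    theta = math.atan2(vy, vx)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    def rotate(p):
        # Coordinates in the frame whose X' axis follows the average direction.
        return (p[0] * cos_t + p[1] * sin_t, -p[0] * sin_t + p[1] * cos_t)

    def rotate_back(p):
        return (p[0] * cos_t - p[1] * sin_t, p[0] * sin_t + p[1] * cos_t)

    rotated = [(rotate(s), rotate(e)) for s, e in segments]
    # Starting and ending points sorted by their X' coordinate.
    sweep_positions = sorted({p[0] for seg in rotated for p in seg})

    result = []
    for x in sweep_positions:
        ys = []
        for (x1, y1), (x2, y2) in rotated:
            if min(x1, x2) <= x <= max(x1, x2):
                if x1 == x2:
                    ys.append((y1 + y2) / 2.0)
                else:
                    # Y' of the segment at the sweep position.
                    ys.append(y1 + (y2 - y1) * (x - x1) / (x2 - x1))
        # Emit a point only where enough segments are crossed.
        if len(ys) >= min_lns:
            result.append(rotate_back((x, sum(ys) / len(ys))))
    return result
```

For two parallel horizontal segments the result is the line halfway between them, which is the behaviour the description suggests.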

Page 41: Clustering Trajectories of Moving Objects in an Uncertain World


Handling Uncertainty

Handling uncertainty is a relatively new topic!

A lot of research effort has been devoted to developing models for representing uncertainty in trajectories.
  The most popular one [Trajcevski et al. 2004]: a trajectory of an object is modeled as a 3D cylindrical volume around the tracked trajectory (polyline)

Various degrees of uncertainty

Page 42: Clustering Trajectories of Moving Objects in an Uncertain World


Trajectory Uncertainty vs. Anonymization

Never Walk Alone [Bonchi et al. 2008]
  Trade uncertainty for anonymity: trajectories that are close up to the uncertainty threshold are indistinguishable
  Combine k-anonymity and perturbation

Two steps:
  Cluster trajectories into groups of k similar ones (removing outliers)
  Perturb trajectories in a cluster so that each one is close to each other up to the uncertainty threshold

Page 43: Clustering Trajectories of Moving Objects in an Uncertain World


An example

$A, B, C \in IFSs(E)$ with A={x, 0.4, 0.2}, B={x, 0.5, 0.3}, C={x, 0.5, 0.2}

$$Z(A, B) = \frac{\frac{0.4}{0.5} + \frac{0.2}{0.3} + \frac{0.2}{0.4}}{3} = 0.65$$

$$Z(A, C) = \frac{\frac{0.4}{0.5} + \frac{0.2}{0.2} + \frac{0.3}{0.4}}{3} = 0.85$$

C is more similar to A than B is

Page 44: Clustering Trajectories of Moving Objects in an Uncertain World


Qualitative evaluation of Z

No | Measure | Counter-intuitive cases | Measure values | Proposed measure value
I | $S_C$, $S_{DC}$ | A = {(x, 0, 0, 1)}, B = {(x, 0.5, 0.5, 0)} | $S_C(A,B) = S_{DC}(A,B) = 1$ | Z = 0
II | $S_H$, $S_{HB}$, $S_e^p$ | A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)} | $S_H(A,B) = S_{HB}(A,B) = S_e^p(A,B) = 0.9$; $S_H(C,D) = S_{HB}(C,D) = S_e^p(C,D) = 0.9$ | Z(A,B) = 0.66, Z(C,D) = 0.83
III | $S_H$, $S_{HB}$, $S_e^p$ | A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)}, C = {(x, 0.5, 0.5, 0)} | $S_H(A,B) = S_{HB}(A,B) = S_e^p(A,B) = 0.5$; $S_H(B,C) = S_{HB}(B,C) = S_e^p(B,C) = 0.5$ | Z(A,B) = Z(B,C) = 0
IV | $S_L$ and $S_S^p$ | A = {(x, 0.4, 0.2, 0.4)}, B = {(x, 0.5, 0.3, 0.2)}, C = {(x, 0.5, 0.2, 0.3)} | $S_L(A,B) = S_S^p(A,B) = 0.95$; $S_L(A,C) = S_S^p(A,C) = 0.95$ | Z(A,B) = 0.65, Z(A,C) = 0.85
V | $S_{HY}^1$, $S_{HY}^2$, $S_{HY}^3$ | A = {(x, 1, 0, 0)}, B = {(x, 0, 0, 1)} | $S_{HY}^1(A,B) = S_{HY}^2(A,B) = S_{HY}^3(A,B) = 0$ | Z(A,B) = 0
VI | $S_{HY}^1$, $S_{HY}^2$, $S_{HY}^3$ | A = {(x, 0.3, 0.3, 0.4)}, B = {(x, 0.4, 0.4, 0.2)}, C = {(x, 0.3, 0.4, 0.3)}, D = {(x, 0.4, 0.3, 0.3)} | $S_{HY}^1(A,B) = S_{HY}^1(C,D) = 0.9$; $S_{HY}^2(A,B) = S_{HY}^2(C,D) = 0.85$; $S_{HY}^3(A,B) = S_{HY}^3(C,D) = 0.82$ | Z(A,B) = 0.66, Z(C,D) = 0.83