trajectory simplification: on minimizing the direction-based error

Trajectory Simplification: On Minimizing the Direction-based ErrorCheng Long, Hong Kong University of Science and

TechnologyRaymond Chi-Wing Wong, Hong Kong University of Science

and TechnologyH. V. Jagadish, University of Michigan

1

Sample A Movement

2

Position (8:01AM)

Position (8:02AM)

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Position(8:09AM)

Position(8:10AM)

10 positionsPosition (8:03AM)

Consider the case where - we have a moving object o - we want to sample the positions of o every one

minute

Trajectory

3

Segment

Segmentp1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Segment

Segment

Segment9 segments

10 positions

Trajectory

4

Trajectory

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

9 segments

10 positions

Trajectory

5

Raw trajectory data is usually very large

For an example,- 10,000 taxis- Sampling rate: 5s- positions per day!!

Issue 1: Storing all sampled positions incurs a very high space costIssue 2: Query processing big trajectory data incurs high time cost

Trajectory Simplificatio

nDrop some positions

As a result, only a portion of the positions is kept

Trajectory Simplification

6

Trajectory Simplification:Drop some positions

Trajectory

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Suppose p2, p4, p5, p7, p8, p9 are dropped

Trajectory6 positions to be dropped


7


Simplified trajectory

p1

p3

p6

p10


8


Trajectory

Trajectory6 positions to be dropped

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Suppose p2, p3, p5, p6, p7, p9 are dropped


9


Another Simplified trajectory

p1

p4

p8

p10


10


Depending which positions to be dropped, it returns different simplified trajectories

Which positions should be dropped?

Our Idea:Drop the positions such that the “direction information” of the original trajectory is preserved

Direction-Preserving Trajectory Simplification (DPTS)


11


What is “direction

information”?

Why to preserve

“direction information”?

How to preserve


What Is “Direction Information”?

12

Trajectory

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

What Is “Direction Information”?

13

Direction information


14



information”?

Why to preserve


How to preserve


Why To Preserve “Direction Information”?

15

Applications That Use “Direction Information”Trajectory

Query Processing

Trajectory Mining

Map Matchin

g Trajectory

Clustering

Trajectory Classificati

on

Trajectory Outlier Detectio

nLee et al. SIGMOD’07, Hung et al. VLDBJ’11

Lee et al. VLDB’08

Brakatsoulas et al. IDEAS’ 04, Pelekis et al.

TIME’07

Brakatsoulas et al.

VLDB’05 Lee et al. ICDE’08

Cluster 2Cluster 1

Trajectory Clustering

16

Trajectory Clustering:Partition a set of trajectories into several clusters such that

- trajectories in the same cluster are similar- trajectories in different clusters are dissimilar

T1 T2 T3 T4

T1 and T2 are similar

T3 and T4 are similar

Trajectory Clustering

17

Cluster 2Cluster 1T1 T2 T3 T4

T1 and T2 have similar “direction information”

T3 and T4 have similar “direction information”

Trajectory

ClusteringLee et al.

SIGMOD’07, Hung et al. VLDBJ’11

Why To Preserve “Direction Information”?

18

Applications That Use “Direction Information”Trajectory

Query Processing

Trajectory Mining

Map Matchin

g Trajectory

Clustering

Trajectory Classificati

on

Trajectory Outlier Detectio

nLee et al. SIGMOD’07, Hung et al. VLDBJ’11

Lee et al. VLDB’08

Brakatsoulas et al. IDEAS’ 04, Pelekis et al.

TIME’07

Brakatsoulas et al.

VLDB’05 Lee et al. ICDE’08Existing trajectory simplification methods:

- Pay NO attention on preserving the “direction information”

- Preserve the “position information” in most cases


19



information”?

Why to preserve


How to preserve


How To Preserve “Direction Information”?

20

Original trajectory: T Simplified trajectory: T’

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Direction-based Error Measurement

p4

p5

p3

p6p1 p2

p3

p6

p7

p8

p9

p10


21

Direction-based Error Measurement

0

𝜋 /2

𝜋3𝜋 /2 0

𝜋 /2

𝜋3𝜋 /2

Error of segment p1-p3



Error of T’ = the max of its segments’ errors

0

𝜋 /2

𝜋3𝜋 /2


22

Problem (Min-Size):Given:

- A trajectory T;- An error tolerance \eps;

Goal: A simplified trajectory T’ s.t.

- T’ has its error at most \eps;

- T’ has the fewest positions

Problem (Min-Error):Given:

- A trajectory T;- A budget W;


- T’ has at most W positions;

- T’ has the smallest error

VLDB 2013

Suitable when the knowledge of \eps is available

Suitable when the knowledge of \eps is NOT available, but a budget is


23

Algorithms For Min-ErrorError-Search

Span-Search

DP

Exact Algorithms

Approximate Algorithm

DP: A Sub-Problem Optimality Property

24

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

Problem Instance 1:- Trajectory: p1-p2-…-

p10- Budget: 4

p1-p3-p6-p10 is the optimal solution for Problem Instance 1

Problem Instance 2:

DP: A Sub-Problem Optimality Property

25

p7

p8

p9

p10

p1 p2

p3

p4

p5 p6


p6- Budget: 3


p10- Budget: 4

p1-p3-p6-p10 is the optimal solution for Problem Instance 1

Sub-problem of Problem Instance 1

p1-p3-p6 is the optimal solution for Problem Instance 2

Could be verified by contradiction


26


Span-Search

DP

Exact Algorithms


Time: O(W n3)Space: O(n2)

Feasible simplification:A simplified trajectory with at most W positions

Error-Search: A Rough Idea

27

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

All possible “feasible simplifications”:

p1-p4-p7-p10



Goal: A simplified trajectory T’ s.t.- T’ has at most W positions;- T’ has the smallest error

Goal: - A feasible simplification with

the smallest error

W = 4


28

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10



p1-p4-p7-p10P1-p4-p6-p10





the smallest error

W = 4


29

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10



p1-p4-p7-p10P1-p4-p6-p10p1-p3-p6-p10





the smallest error

W = 4


30

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10



p1-p4-p7-p10P1-p4-p6-p10p1-p3-p6-p10…

Corresponding errors (in radians):

1.3051.3050.785…

The smallest error

Optimal solution





the smallest error

W = 4


31

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10



p1-p4-p7-p10P1-p4-p6-p10p1-p3-p6-p10…

Corresponding errors (in radians):

1.3051.3050.785…

Size: O( nW) Size: bounded by O( n2)

Multiple simplifications share the same error

O(n2) possible segments in a simplified trajectory





the smallest error

W = 4


32


Span-Search

DP

Exact Algorithms


Time: O(W n3)Space: O(n2)

Time: O(C n2 log n)Space: O(n2)

2-factor approximation

Time: O(n log2 n)Space: O(n)

Span-Search: The Main Idea

33

Original trajectory: T Simplified trajectory: T’

p1 p2

p3

p4

p5 p6

p7

p8

p9

p10

The “span” of a simplified trajectory

p4

p5

p3

p6p1 p2

p3

p6

p7

p8

p9

p10


340

𝜋 /2

𝜋3𝜋 /2 0

𝜋 /2

𝜋3𝜋 /2

Span of p1-p2-p3

Span of p3-p4-p5-p6

Span of p6-p7-p8-p9-p10

Span of T’ = the max of its sub-trajectories’ spans

The “span” of a simplified trajectory

0

𝜋 /2

𝜋3𝜋 /2


35

A Property: A simplified trajectory has its error

- at most its span and- at least half its span

0

𝜋 /2

𝜋3𝜋 /2

0

𝜋 /2

𝜋3𝜋 /2


Error of segment p3-p6 0

𝜋 /2

𝜋3𝜋 /2


0

𝜋 /2

𝜋3𝜋 /2 0

𝜋 /2

𝜋3𝜋 /2

Span of p1-p2-p3

Span of p3-p4-p5-p6

0

𝜋 /2

𝜋3𝜋 /2

Span of p6-p7-p8-p9-p10


36

The feasible simplification with the

smallest span

The feasible simplification with the

smallest error

A trajectory

Error_1 ½ Span_1

Span-Search

DP Error-Search




- T’ has at most W positions;

- T’ has the smallest error

Feasible simplificationTime: O(W

n3) or O(C n2 log n)Space: O(n2)

Span_1Error_1 Span_2Error_2

2-factor approximation Time: O(n log2

n)Space: O(n)

Span_1 Span_2 Span_2 Error_2Error_1 ½ Error_2

Experiments: Settings Datasets:

Geolife and T-Drive Algorithms

Exact: DP and Error-Search Approximate: Span-Search and Douglas-Peucker

Experiments DPTS vs. Wavelet Transformation Performance Studies

37

# of trajectories

Total # of

positions

Average # of

positions / trajectory

Geolife 18K 25M 1KT-Drive 10K 18M 2K

Most well-known trajectory simplification algorithm

Experiments: Min-Error vs. Wavelet Transformation

38

ErrorBudget W (in terms of %)

Experiments: Performance Studies on Exact Algorithms (1)

39

Running time# of positions in the

trajectory

Experiments: Performance Studies on Exact Algorithms (2)

40

Running timeBudget W (in terms of

%)

Experiments: Performance Studies on Approximate Algorithms

41

Appro. factorBudget W (in terms of

%)

Run. timeBudget W (in terms of %)

Conclusion A New Problem: Min-Error Two Exact Algorithms: DP and Error-Search One 2-factor Approximate Algorithm: Span-

Search Experiments

42

Q & A

43

trajectory simplification: on minimizing the direction-based error

Documents