Learning Trajectory Patterns by Clustering: Comparative Evaluation Group D

Upload: brilliant

Post on 23-Feb-2016


Page 1: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Learning Trajectory Patterns by Clustering: Comparative Evaluation

Group D

Page 2: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Problem Description & Definition

Page 3: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Problem Description & Definition

• Preprocessing: grid quantization

• Clustering

Distance/similarity: modified Euclidean distance, dynamic time warping (DTW), and longest common subsequence (LCSS)

Clustering: bisection (divisive), agglomerative, and min-cut graph-based, with the number of clusters predefined

• Clustering validation: ground-truth based, using the Hungarian algorithm to match the generated clusters with the ground-truth clusters

Page 4: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Preprocessing

• Normalization

• Grid quantization (s = 2)
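A minimal sketch of grid quantization, assuming that s is the side length of a square grid cell and that consecutive points falling in the same cell are collapsed into one. The function name `quantize` and the sample track are illustrative only; the slides do not specify these details.

```python
import math

def quantize(trajectory, s=2.0):
    """Map (x, y) points to grid-cell indices and drop consecutive repeats.

    Assumption: s is the side length of a square grid cell.
    """
    if s <= 0:
        raise ValueError("s must be positive")
    cells = []
    for x, y in trajectory:
        cell = (math.floor(x / s), math.floor(y / s))
        # Keep a cell only when the trajectory moves into a new one.
        if not cells or cells[-1] != cell:
            cells.append(cell)
    return cells

# Hypothetical sample track, not taken from the dataset.
track = [(0.2, 0.1), (0.9, 1.4), (2.1, 1.9), (3.8, 2.2), (4.1, 4.5)]
print(quantize(track, s=2))
```

Collapsing repeated cells shortens each trajectory, which in turn reduces the cost of the pairwise distance computations that follow.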

Page 5: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Preprocessing

[Figure: entry/exit regions labeled Location 1, Location 2, Location 3, Location 4]

• Computational complexity reduction: entry and exit detection based on clustering the starting and ending points of each trajectory (k-means clustering, k = 4)
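Entry and exit detection can be sketched as follows. This is a plain Lloyd's k-means written with the standard library only; the farthest-point initialization is an assumption chosen to keep the example deterministic, and the names `kmeans` and `entry_exit_labels` are hypothetical.

```python
import math

def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means with deterministic farthest-point seeding."""
    centers = [tuple(points[0])]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(tuple(max(
            points, key=lambda p: min(math.dist(p, c) for c in centers))))
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

def entry_exit_labels(trajectories, k=4):
    """Cluster all start and end points together and return one
    (entry zone, exit zone) pair per trajectory."""
    endpoints = [t[0] for t in trajectories] + [t[-1] for t in trajectories]
    labels, _ = kmeans(endpoints, k)
    n = len(trajectories)
    return list(zip(labels[:n], labels[n:]))

# Hypothetical trajectories crossing between four corner zones.
trajs = [
    [(0, 0), (5, 5), (10, 10)],
    [(0.5, 0.2), (5, 5), (10.2, 9.8)],
    [(0, 10), (5, 5), (10, 0)],
    [(0.3, 9.7), (5, 5), (9.9, 0.4)],
]
pairs = entry_exit_labels(trajs, k=4)
print(pairs)
```

Trajectories that share the same (entry, exit) pair start and end in the same locations.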

Page 6: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Distance Metrics

• Modified Euclidean Distance (m > n)

\[ d(p, q) = d(q, p) = \sqrt{(q_1 - p_1)^2 + (q_2 - p_2)^2 + \dots + (q_n - p_n)^2 + (q_{n+1} - p_n)^2 + \dots + (q_m - p_n)^2} \]

• Dynamic Time Warping

DTW compares signals of unequal length by finding a time warping that minimizes the total distance between matched points.
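Both distances can be sketched in a few lines. The modified Euclidean version assumes the distance is the square root of the summed squared point differences, with the last point of the shorter trajectory reused for every extra point of the longer one. The DTW version is the textbook dynamic program with a Euclidean point cost; it is not necessarily the exact variant used in the slides.

```python
import math

def modified_euclidean(p, q):
    """Euclidean distance for unequal lengths: the shorter trajectory's
    last point is paired with every extra point of the longer one."""
    if not p or not q:
        raise ValueError("trajectories must be non-empty")
    if len(p) > len(q):
        p, q = q, p  # p is the shorter trajectory (n points), q the longer (m)
    n = len(p)
    total = 0.0
    for i, qi in enumerate(q):
        pi = p[min(i, n - 1)]
        total += math.dist(qi, pi) ** 2
    return math.sqrt(total)

def dtw(p, q):
    """Classic O(n*m) dynamic time warping with Euclidean point cost."""
    n, m = len(p), len(q)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(p[i - 1], q[j - 1])
            # Extend the cheapest of the three admissible warping moves.
            cost[i][j] = step + min(cost[i - 1][j],
                                    cost[i][j - 1],
                                    cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical trajectories of different lengths.
short = [(0, 0), (1, 0)]
long_ = [(0, 0), (1, 0), (2, 0)]
print(modified_euclidean(short, long_), dtw(long_, [(0, 0), (2, 0)]))
```

The quadratic table in `dtw` is what makes it the most expensive of the distance measures compared here.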

Page 7: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Distance Metrics

• Longest Common Subsequence (LCSS)

s1 = {a, b, c, d, e, f}; s2 = {b, d, e, f, m, n}; LCSS(s1, s2) = {b, d, e, f}

In LCSS matching, δ is a constant that controls how far back in time we can look for a match, and ε is a constant that controls the size of the neighborhood in which we look for matches.
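The roles of δ and ε can be illustrated with a short sketch. It assumes a commonly used LCSS recursion for trajectories: two points match when each coordinate differs by less than ε and their indices differ by at most δ. The conversion to a distance, 1 − LCSS / min(n, m), is also an assumption; the slides do not show the exact formula.

```python
def lcss_length(a, b, delta, eps):
    """Length of the longest common subsequence of two 2-D trajectories.

    Assumed matching rule: |ax - bx| < eps, |ay - by| < eps, |i - j| <= delta.
    """
    n, m = len(a), len(b)
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            (ax, ay), (bx, by) = a[i - 1], b[j - 1]
            close = abs(ax - bx) < eps and abs(ay - by) < eps
            if close and abs(i - j) <= delta:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[n][m]

def lcss_distance(a, b, delta, eps):
    """Assumed normalization: 0 when the shorter trajectory is fully matched."""
    return 1.0 - lcss_length(a, b, delta, eps) / min(len(a), len(b))

# Hypothetical trajectories: the third point of t2 is an outlier.
t1 = [(0, 0), (1, 1), (2, 2), (3, 3)]
t2 = [(0, 0.1), (1, 1.1), (5, 5), (3, 3.1)]
print(lcss_length(t1, t2, delta=1, eps=0.5))
print(lcss_distance(t1, t2, delta=1, eps=0.5))
```

Unlike DTW, an outlier point is simply skipped rather than forced into a match, which is why LCSS tolerates noisy tracks.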

Page 8: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Distance to Similarity Metrics

• Gaussian Kernel Function

A similarity matrix S = {s_ij}, which represents a fully connected graph, is constructed from the trajectory distances using a Gaussian kernel function.

Here D is one of the distance measures defined previously, and the parameter σ describes the trajectory neighborhood. Large values of σ give trajectories that are farther apart a higher similarity score, while small values lead to a sparser similarity matrix (more entries are very small).

[Figure: DTW, panels for σ = 0.1, 0.9, 2.1, 4.1, 7.1]
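The effect of σ can be checked numerically with a small sketch. It assumes the common Gaussian kernel form s_ij = exp(−D_ij² / (2σ²)); the exact expression used in the slides is not recoverable, so treat the formula and the function name `similarity_matrix` as assumptions.

```python
import math

def similarity_matrix(dist, sigma):
    """Convert a distance matrix to similarities with a Gaussian kernel.

    Assumed form: s_ij = exp(-d_ij**2 / (2 * sigma**2)).
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [[math.exp(-(d * d) / (2.0 * sigma * sigma)) for d in row]
            for row in dist]

# Hypothetical 3x3 distance matrix.
D = [[0.0, 1.0, 4.0],
     [1.0, 0.0, 3.0],
     [4.0, 3.0, 0.0]]
narrow = similarity_matrix(D, sigma=0.9)
wide = similarity_matrix(D, sigma=7.1)
print(narrow[0][2], wide[0][2])
```

With the narrow kernel the far-apart pair gets a similarity close to zero, while the wide kernel keeps it well above zero, matching the behavior described above.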

Page 9: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Clustering Methods (CLUTO)

• Divisive: Divisive clustering is a top-down approach in which the entire trajectory training set is initially considered a single cluster. The K clusters are obtained by performing K − 1 repeated bisections, where each bisection yields an optimal 2-way division of the similarity matrix. In addition to ensuring local optimality, a global optimization step is used to optimize the solution across all bisections.

• Agglomerative: Agglomerative clustering is a bottom-up strategy that initially treats each trajectory as an individual cluster and merges similar clusters hierarchically in a tree-like structure, stopping when only K clusters remain.

• Graph (min-cut): Like divisive clustering, graph methods seek to divide the full dataset into individual clusters. Instead of operating directly on the similarity matrix, a nearest-neighbor graph is constructed in which each trajectory is a vertex. Each vertex is connected by a weighted edge to its most similar trajectories. The K clusters are found using a min-cut partitioning algorithm, which finds a division of the graph with minimal loss of edge weight.
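The bottom-up strategy can be illustrated with a small sketch that works directly on a similarity matrix. It uses average linkage as the merge criterion, which is an assumption; it is not CLUTO's implementation, and the name `agglomerative` is hypothetical.

```python
def agglomerative(sim, k):
    """Merge the two most similar clusters (average linkage) until k remain."""
    if k < 1:
        raise ValueError("k must be at least 1")
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise similarity between the two clusters.
                total = sum(sim[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg > best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

# Hypothetical similarity matrix with two obvious groups: {0, 1} and {2, 3}.
S = [[1.0, 0.9, 0.1, 0.2],
     [0.9, 1.0, 0.15, 0.1],
     [0.1, 0.15, 1.0, 0.8],
     [0.2, 0.1, 0.8, 1.0]]
print(agglomerative(S, k=2))
```

The divisive method runs the other way: it starts from one cluster holding all four items and splits until K clusters exist.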

Page 10: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Clustering Validation

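[Figure: ground-truth clusters (c1, c2, c3) next to the clusters to be evaluated (c1, c2, c3)]

The Hungarian algorithm is used to match the generated clusters to the ground-truth clusters so that the number of matches is maximized.

Accuracy = n_matched / n_total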
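The accuracy computation can be sketched as below. To stay within the standard library, this example searches all one-to-one assignments by brute force instead of running the Hungarian algorithm; both give the same optimum, but brute force is only practical for a handful of clusters. It also assumes that n_matched counts trajectories whose generated cluster is assigned to their ground-truth cluster.

```python
from collections import Counter
from itertools import permutations

def matched_accuracy(truth, pred):
    """Best one-to-one cluster matching, then n_matched / n_total.

    Brute-force stand-in for the Hungarian algorithm (small K only).
    """
    overlap = Counter(zip(pred, truth))
    pred_ids = sorted(set(pred))
    truth_ids = sorted(set(truth))
    # Pad with None so surplus generated clusters can stay unmatched.
    truth_ids += [None] * max(0, len(pred_ids) - len(truth_ids))
    best = 0
    for assignment in permutations(truth_ids, len(pred_ids)):
        matched = sum(overlap[(p, t)] for p, t in zip(pred_ids, assignment))
        best = max(best, matched)
    return best / len(truth)

# Hypothetical labels: cluster ids differ, and one trajectory is misplaced.
truth = [0, 0, 1, 1, 2, 2]
pred = [1, 1, 0, 0, 2, 0]
print(matched_accuracy(truth, pred))
```

Matching is needed because cluster ids are arbitrary: generated cluster 1 here corresponds to ground-truth cluster 0, and a naive label comparison would count those trajectories as errors.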

Page 11: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation

• Dataset: Lankershim dataset, 1032 trajectories, 18 clusters

• CLUTO: CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. The standalone program scluster is used to cluster the trajectories.
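Since scluster reads a similarity graph from a file, the similarity matrix has to be serialized first. The sketch below assumes CLUTO's dense graph-file layout (the vertex count on the first line, followed by one row of similarities per line); this layout should be checked against the CLUTO manual before use, and the helper name `to_cluto_dense_graph` is hypothetical. The scluster invocation itself is not shown.

```python
def to_cluto_dense_graph(sim):
    """Serialize a square similarity matrix as text.

    Assumed layout (verify against the CLUTO manual): first line holds the
    number of vertices, then one space-separated row per vertex.
    """
    n = len(sim)
    if any(len(row) != n for row in sim):
        raise ValueError("similarity matrix must be square")
    lines = [str(n)]
    for row in sim:
        lines.append(" ".join(f"{value:.6f}" for value in row))
    return "\n".join(lines) + "\n"

# Hypothetical 2x2 similarity matrix.
text = to_cluto_dense_graph([[1, 0.5], [0.5, 1]])
print(text)
```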

Page 12: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Distance Metrics

• How the width σ of the Gaussian kernel function influences the conversion from distance matrix to similarity matrix: σ should be large enough.

[Figure: DTW + Agglomerative, accuracy vs. σ]

Page 13: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Distance Metrics

• How the width σ of the Gaussian kernel function influences the conversion from distance matrix to similarity matrix: σ should be large enough.

[Figure: DTW + Divisive, accuracy]

Page 14: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Distance Metrics

• How the width σ of the Gaussian kernel function influences the conversion from distance matrix to similarity matrix: σ should be large enough.

[Figure: Modified Euclidean + Divisive]

Page 15: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Distance Metrics

• How the (δ, ε) parameters of LCSS influence the clustering results

[Figure: LCSS + Graph, varying δ]

Page 16: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Clustering

• How the (δ, ε) parameters of LCSS influence the clustering results

[Figure: LCSS + Graph, varying ε]

Page 17: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Evaluation - Clustering

• Modified Euclidean, DTW: σ = 7.1

• LCSS: δ = 3, ε = 8

d1 = Modified Euclidean, d2 = DTW, d3 = LCSS; c1 = divisive, c2 = agglomerative, c3 = graph

Distance metric                   d1      d1      d1      d2      d2      d2      d3      d3      d3
Clustering                        c1      c2      c3      c1      c2      c3      c1      c2      c3
Accuracy                          0.83    0.57    0.822   0.977   0.83    0.917   0.956   0.91    0.959
Distance computation time (s)     0.0015  0.0015  0.0015  0.15    0.15    0.15    0.02    0.02    0.02
Clustering computation time (s)   2.859   0.359   0.297   2.782   0.375   0.305   3.031   0.328   0.532

Page 18: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Conclusion

• Distance metric computational complexity: d1 < d3 < d2

• Distance metric distinguishability: d1 < d2 < d3

• Clustering capability: c2 < c3 ≈ c1

• Clustering computational complexity: c2 ≈ c3 < c1 (by the measured clustering times)

• Overall performance: d3 (LCSS) + c3 (graph) is the best combination

Page 19: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Demo

Page 20: Learning Trajectory Patterns by Clustering: Comparative Evaluation

Thanks