Mining Interesting Locations and Travel Sequences from GPS Trajectories
IDB & IDS Lab. Seminar
Summer 2009
강 민 석
July 23rd, 2009
Yu Zheng, Lizhu Zhang, Xing Xie, Wei-Ying Ma
WWW 2009
Center for E-Business Technology, Seoul National University, Seoul, Korea
Microsoft Research Asia
Intelligent Database Systems Lab.
Copyright 2009 by CEBT
Abstract
Mining Interesting Locations and Travel Sequences from GPS Trajectories
GPS log: a record of users’ outdoor movements captured with GPS
By mining multiple users’ location histories, discover interesting locations and travel sequences in a given region
Problem
How to model multiple users’ location histories from GPS logs
How to infer the interest level of a location
Location interest depends not only on the number of visits, but also on the visitors’ travel experiences
How to detect classical travel sequences in a given region
timestamp        latitude          longitude
07-01 12:30:00   N 33º 30’ 19.5”   E 126º 29’ 35.3”
07-01 12:30:30   N 33º 30’ 19.4”   E 126º 29’ 35.2”
07-01 12:31:00   N 33º 30’ 19.2”   E 126º 29’ 35.3”
07-01 12:31:30   N 33º 30’ 19.1”   E 126º 29’ 35.3”
07-01 12:32:00   N 33º 30’ 19.1”   E 126º 29’ 35.4”
Contents
Introduction
Modeling Location History
Location Interest Inference
Experiments
Related Work
Conclusions
Introduction
GPS log
Recently, many users have started recording their outdoor movements with GPS
e.g., travel experience sharing, life logging, sports activities
GPS devices are changing the way people interact with the Web by using locations as context
Introduction
Architecture
The system comprises three parts:
location history modeling, location interest and sequence mining, and recommendation
[Figure: system architecture. Stages: Location History Modeling; Location Interest and Sequence Mining; Recommendation. Components: GPS Logs, Modeling Location History, Tree-Based Hierarchical Graph, HITS-Based Inference Model, User Travel Experience, Location Interest, Mining Travel Sequences, Experienced Users, Interesting Locations, Travel Sequences, Location Recommender]
Contents
Introduction
Modeling Location History
GPS Trajectory & Stay Point
Location History
Tree-Based Hierarchical Graph (TBHG)
Location Interest Inference
Experiments
Related Work
Conclusions
Modeling Location History
GPS Trajectory
GPS point: contains (timestamp, latitude, longitude)
GPS log: a collection of GPS points
GPS trajectory: GPS points connected sequentially in time order
Stay Point
A geographic region where a user stayed over a certain time interval
Time threshold T: the user stays in the region longer than T (e.g., 20 min)
Distance threshold D: the distance between two points is less than D (e.g., 200 m)
timestamp        latitude          longitude
07-01 12:30:00   N 33º 30’ 19.5”   E 126º 29’ 35.3”
07-01 12:30:30   N 33º 30’ 19.4”   E 126º 29’ 35.2”
07-01 12:31:00   N 33º 30’ 19.2”   E 126º 29’ 35.3”
07-01 12:31:30   N 33º 30’ 19.1”   E 126º 29’ 35.3”
07-01 12:32:00   N 33º 30’ 19.1”   E 126º 29’ 35.4”
07-01 12:32:30   N 33º 30’ 19.1”   E 126º 29’ 35.4”
07-01 12:33:00   N 33º 30’ 19.2”   E 126º 29’ 35.4”
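The stay-point definition above (time threshold T, distance threshold D) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class names `GpsPoint` and `StayPoint`, the function names, the haversine distance, the choice to measure distance from the first point of each group, and the use of the mean coordinate are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass
class GpsPoint:
    t: float    # timestamp in seconds
    lat: float  # latitude in decimal degrees
    lon: float  # longitude in decimal degrees


@dataclass
class StayPoint:
    lat: float     # mean latitude of the grouped points
    lon: float     # mean longitude of the grouped points
    arrive: float  # timestamp of the first grouped point
    leave: float   # timestamp of the last grouped point


def haversine_m(a: GpsPoint, b: GpsPoint) -> float:
    """Great-circle distance in meters (spherical Earth, radius 6371 km)."""
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371000.0 * asin(sqrt(h))


def detect_stay_points(points, dist_m=200.0, time_s=20 * 60):
    """Group consecutive points that stay within dist_m of an anchor point
    for at least time_s seconds (assumed reading of thresholds D and T)."""
    stays = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        # Extend the group while points remain within dist_m of the anchor points[i].
        while j < n and haversine_m(points[i], points[j]) < dist_m:
            j += 1
        group = points[i:j]
        if group[-1].t - group[0].t >= time_s:
            stays.append(StayPoint(
                lat=sum(p.lat for p in group) / len(group),
                lon=sum(p.lon for p in group) / len(group),
                arrive=group[0].t,
                leave=group[-1].t,
            ))
            i = j   # continue after the detected stay
        else:
            i += 1  # no stay anchored here; try the next point
    return stays
```

With points logged every 30 seconds, as in the table above, a user must remain within about 200 m for at least 40 consecutive samples before a stay point is emitted under the default thresholds.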
Modeling Location History
Location History
Represented as a sequence of stay points with their corresponding arrival and leaving times
[Figure: a user's location history as a sequence of stay points S1 to S10, with example places Home, Supermarket, Company, Restaurant]
Modeling Location History
Modeling multiple users’ location histories
Location histories of different people are inconsistent and incomparable
Stay points of different individuals are not identical
The geospatial scale of a location also has to be considered
[Figure: stay points S1 to S10 (Home, Supermarket, Company, Restaurant) grouped into clusters C1 to C4, with labels A and B]
Modeling Location History
Tree-Based Hierarchy
Build a tree using a hierarchical clustering algorithm
The density-based clustering algorithm OPTICS (Ordering Points To Identify the Clustering Structure) is used
Stay points are hierarchically clustered into geospatial regions
Different levels of the tree denote different geospatial granularities
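To make the idea of levels with different geospatial granularity concrete, here is a small sketch. It does not implement OPTICS. As a deliberately simpler stand-in, it groups points by single-linkage with a distance threshold and repeats this with a smaller threshold per level, which also yields nested clusters. The function names, the planar (x, y) coordinates in meters, and the radii are assumptions chosen for illustration.

```python
from math import hypot


def cluster_at_scale(points, radius):
    """Single-linkage grouping: points within `radius` of each other
    (directly or through a chain of neighbors) share a cluster label."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi) in enumerate(points):
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if hypot(xi - xj, yi - yj) <= radius:
                parent[find(i)] = find(j)

    labels = {}  # root index -> compact label, in order of first appearance
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


def build_hierarchy(points, radii):
    """Return one label list per level, from coarse (large radius) to fine."""
    return [cluster_at_scale(points, r) for r in sorted(radii, reverse=True)]


# Hypothetical stay points on a line, in meters.
stay_points = [(0, 0), (50, 0), (1000, 0), (1050, 0), (10000, 0)]
levels = build_hierarchy(stay_points, radii=[2000, 100])
```

Here `levels[0]` groups the first four points into one coarse region, while `levels[1]` splits them into two finer clusters, so every fine cluster sits inside exactly one coarse cluster.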
Modeling Location History
Tree-Based Hierarchical Graph (TBHG)
1. Formulate a Tree-based Hierarchy
Hierarchically cluster stay points
2. Build Graphs on each Level
A link is generated between two clusters when consecutive stay points of a user are contained in those two clusters
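Step 2 above can be sketched as follows, assuming the clustering of step 1 is already available as a mapping from stay point to cluster for each level. The function names and data layout are hypothetical, and skipping transitions that stay inside the same cluster is an assumption of this sketch.

```python
from collections import Counter


def build_level_graph(histories, cluster_of):
    """Count directed links between clusters on one level.

    histories:  {user: [stay_point_id, ...]} in time order
    cluster_of: {stay_point_id: cluster_id} for this level
    Returns a Counter mapping (from_cluster, to_cluster) -> number of transitions.
    """
    edges = Counter()
    for stays in histories.values():
        for a, b in zip(stays, stays[1:]):
            ca, cb = cluster_of[a], cluster_of[b]
            if ca != cb:  # assumption: moves inside one cluster create no link
                edges[(ca, cb)] += 1
    return edges


def build_graphs(histories, clusters_by_level):
    """Build one graph per level of the hierarchy."""
    return {level: build_level_graph(histories, mapping)
            for level, mapping in clusters_by_level.items()}


# Hypothetical input: two users and one level of clusters.
histories = {"u1": ["s1", "s2", "s3"], "u2": ["s4", "s5"]}
clusters_by_level = {3: {"s1": "c1", "s2": "c1", "s3": "c2", "s4": "c1", "s5": "c2"}}
graphs = build_graphs(histories, clusters_by_level)
```

In this example both users move from cluster c1 to cluster c2, so the level-3 graph holds a single link c1 → c2 with weight 2.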
Modeling Location History
Tree-Based Hierarchical Graph (TBHG)
A user's location history can be represented, on each geospatial scale, as a sequence of stay-point clusters with the transition times between consecutive clusters
[Figure: stay points S1 to S10 (Home, Supermarket, Company, Restaurant) and clusters C1 to C4, shown as cluster sequences on different geospatial scales, with labels A and B]
Contents
Introduction
Modeling Location History
Location Interest Inference
HITS-Based Inference Model
Mining Classical Travel Sequences
Experiments
Related Work
Conclusions
Location Interest Inference
HITS (Hypertext Induced Topic Search)
A search-query-dependent ranking algorithm for Web information retrieval
Produces two rankings
Hub: a web page with many out-links
Authority: a web page with many in-links
Hubs and authorities have a mutually reinforcing relationship
Location Interest Inference
HITS-Based Inference Model
Regard a user’s visit to a location as an implicit directed link from the user to that location
Hub and Authority
Hub: a user who has visited many places → the user’s travel experience
Authority: a location that has been visited by many users → location interest
Mutual reinforcement relationship
Users’ travel experiences (hub scores) and the interest of locations (authority scores) reinforce each other
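The mutual reinforcement can be written in the standard HITS form. The notation here is assumed for illustration and is not taken from the slides: $M_{ul}$ is the number of visits of user $u$ to location $l$ within the chosen region, $h_u$ is the hub score of user $u$, and $a_l$ is the authority score of location $l$.

```latex
% Assumed notation: M_{ul} = visits of user u to location l,
% h_u = hub score (travel experience), a_l = authority score (location interest).
\begin{aligned}
a_l &= \sum_{u} M_{ul}\, h_u, \\
h_u &= \sum_{l} M_{ul}\, a_l,
\end{aligned}
\qquad \text{or in matrix form} \qquad
\mathbf{a} = M^{\top}\mathbf{h}, \quad \mathbf{h} = M\,\mathbf{a}.
```

A location becomes more interesting when experienced users visit it, and a user becomes more experienced by visiting interesting locations.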
Location Interest Inference
Data Selection Strategy
Motivation
A user’s travel experience is region-related
A geospatial region must be specified before conducting HITS-based inference
Strategy
Calculate scores within the regions specified by the ascendant clusters
A user or location can have multiple hub and authority scores, depending on the region scale
Location Interest Inference
Inference
Build an adjacency matrix between users and locations
It captures the mutual reinforcement between user travel experience and location interest
Generate the final results by an iterative process
Calculate authority and hub scores using the power iteration method
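The iterative process can be sketched as a plain power iteration over the user-location visit counts. This is a minimal illustration in the style of standard HITS, not the authors' code: the function names, the dictionary-based adjacency structure, the fixed iteration count, and the L2 normalization are assumptions.

```python
from math import sqrt


def normalize(scores):
    """Scale scores to unit L2 norm (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def hits(visits, iterations=50):
    """Power iteration on a user-location adjacency structure.

    visits: {user: {location: visit_count}} for one chosen region
    Returns (hub, auth): travel experience per user, interest per location.
    """
    users = list(visits)
    locations = sorted({loc for seen in visits.values() for loc in seen})
    hub = {u: 1.0 for u in users}
    auth = {loc: 1.0 for loc in locations}
    for _ in range(iterations):
        # Authority: sum of hub scores of the users who visited the location.
        auth = normalize({
            loc: sum(visits[u].get(loc, 0) * hub[u] for u in users)
            for loc in locations
        })
        # Hub: sum of authority scores of the locations the user visited.
        hub = normalize({
            u: sum(count * auth[loc] for loc, count in visits[u].items())
            for u in users
        })
    return hub, auth


# Hypothetical visit counts within one region.
visits = {
    "u1": {"A": 1, "B": 1, "C": 1},
    "u2": {"A": 1},
    "u3": {"A": 1, "B": 1},
}
hub, auth = hits(visits)
```

In this toy input, location A ends up with the highest authority score and user u1 with the highest hub score, because u1 visited every location and every user visited A.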
Mining Classical Travel Sequences
Calculate a score for each location sequence from
the travel experiences of the users taking this sequence: the hub scores of those users
the interests of the locations contained in the sequence: the authority scores of those locations
Example on the TBHG: what is the classical score of sequence A→C→D?
5 users have taken A→C, and we know each user’s hub score
We know location C’s authority score
Mining Classical Travel Sequences
Calculate a score for each location sequence from
the travel experiences of the users taking this sequence: the hub scores of those users
the interests of the locations contained in the sequence: the authority scores of those locations
Authority scores are weighted by the probability that users take this sequence
Example: the classical score of sequence A→C→D combines
the authority score of location A
the hub scores of the users
the probability of moving out from A along this sequence
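The slides name the ingredients of the classical score (hub scores of the users, authority scores of the locations, and the probability of taking the transition) but do not give the formula. The sketch below shows one plausible way to combine them and should be read as an assumption: for each transition, every user who took it contributes the start location's authority weighted by the share of outgoing moves, the end location's authority weighted by the share of incoming moves, and the user's own hub score. All names and numbers are hypothetical.

```python
from collections import defaultdict


def classical_score(sequence, edge_users, auth, hub):
    """Score a location sequence such as ["A", "C", "D"].

    edge_users: {(src, dst): [user, ...]} users who moved directly src -> dst
    auth:       {location: authority score}
    hub:        {user: hub score}
    """
    out_total = defaultdict(int)  # transitions leaving each location
    in_total = defaultdict(int)   # transitions entering each location
    for (src, dst), users in edge_users.items():
        out_total[src] += len(users)
        in_total[dst] += len(users)

    total = 0.0
    for src, dst in zip(sequence, sequence[1:]):
        users = edge_users.get((src, dst), [])
        if not users:
            continue  # nobody took this transition; it adds nothing
        p_out = len(users) / out_total[src]  # share of moves out of src going to dst
        p_in = len(users) / in_total[dst]    # share of moves into dst coming from src
        total += sum(auth[src] * p_out + auth[dst] * p_in + hub[u] for u in users)
    return total


# Hypothetical inputs.
edge_users = {("A", "C"): ["u1", "u2"], ("A", "B"): ["u3"], ("C", "D"): ["u1"]}
auth = {"A": 0.6, "B": 0.2, "C": 0.5, "D": 0.3}
hub = {"u1": 0.9, "u2": 0.4, "u3": 0.1}
score_acd = classical_score(["A", "C", "D"], edge_users, auth, hub)
```

Under these assumptions a sequence scores highly when it connects interesting locations, when it is a common way to leave or reach them, and when experienced users are the ones taking it.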
Contents
Introduction
Modeling Location History
Location Interest Inference
Experiments
Related Work
Conclusions
Experimental Settings
GPS Data
GPS devices were used to collect the data
Users
107 users recorded their outdoor movements
They were paid based on the distance covered by their GPS logs
Data
Mostly in China, some in the USA, Korea, and Japan
1 year (from May 2007 to Oct. 2008)
5 million GPS points (166,372 km)
Parameters
Stay points
Extracted 10,354 stay points
Clustering
159 clusters (4th level of the TBHG)
Evaluation Approaches
Evaluation
Explore the effectiveness of location and travel recommendation through a user study
29 subjects who have been in Beijing for more than 6 years
Two Aspects of Evaluation
Presentation
The ability of the retrieved interesting locations to represent a given region
Criteria: representative, comprehensive, novelty
Rank
The ranking performance of the retrieved locations based on relative interest
User desirability rating for each location and each sequence
Two criteria are employed: nDCG and MAP
Baselines
Interesting locations
rank-by-count, rank-by-frequency
Classical travel sequences
rank-by-count, rank-by-interests, rank-by-experience
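For readers unfamiliar with nDCG, one of the two ranking criteria named above, here is a small sketch. The exact gain and discount used in the study are not stated in the slides, so the common variant with gain 2^rating - 1 and a log2 position discount is an assumption, as are the function names and the sample ratings.

```python
from math import log2


def dcg(ratings):
    """Discounted cumulative gain of ratings listed in ranked order."""
    return sum((2 ** r - 1) / log2(pos + 2) for pos, r in enumerate(ratings))


def ndcg(ratings, k=None):
    """DCG of the top-k ranked items divided by the DCG of the ideal ordering."""
    top = ratings[:k]
    ideal = sorted(ratings, reverse=True)[:len(top)]
    best = dcg(ideal)
    return dcg(top) / best if best > 0 else 0.0


# Hypothetical user ratings of returned locations, in the order a method ranked them.
perfect = ndcg([2, 1, 0])   # already in the ideal order
reversed_ = ndcg([0, 1, 2])  # the best location is ranked last
```

A method that places highly rated locations near the top scores close to 1, while a method that buries them scores lower.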
Experimental Results
Results
The proposed approach outperformed the baseline approaches
Investigations
Advantages of the hierarchy of the TBHG
Helps users understand a region step by step (level by level)
Can be used to specify users’ travel experiences in different regions
Contents
Introduction
Modeling Location History
Location Interest Inference
Experiments
Related Work
Mining Location History
Location Recommenders
Conclusions
Related Work
Mining Location History
Individual location history
Detect significant locations of a user
Predict a user’s movement
Recognize user-specific activities at each location
Multiple users’ location history
Mine similar sequences
Predict where a driver may be going
Recognize social patterns in daily user activity
Related Work
Location Recommenders
Recommenders based on real-time location
Mobile tourist guide systems
Recommenders based on location history
More personalized recommendation using location history
Recommend geographic locations such as shops or restaurants
Enhance collaborative filtering solutions
Contents
Introduction
Modeling Location History
Location Interest Inference
Experiments
Related Work
Conclusions
Conclusion
Mining Interesting Locations and Travel Sequences from GPS Trajectories
Proposes a tree-based hierarchical graph (TBHG), which can model multiple users’ location histories
Proposes a HITS-based model to infer users’ travel experiences and the interest of a location within a region
Mines travel sequences by considering users’ travel experiences and location interests
Evaluates the methodology using a large GPS dataset
Conclusion
Implications
Helps understand the correlation between users and locations
Enables location and travel recommendation
A step towards enhancing the mobile Web using multiple users’ location histories
Improves location-based services by integrating social networking into the mobile Web
GeoLife project
Building social networks using human location history
A location-based social-networking service on Microsoft Virtual Earth
Enables users to share life experiences and build connections with each other using human location history
Discussion
Discussion about this paper (talked with Sungchan)
Modeling Location History
Stay point detection is simple and easy to apply
The hierarchy model is well suited to zooming in and out of a map
HITS-based Location Interest Inference
Fairly reasonable: considering users’ travel experience is better than rank-by-count
However, other ways of finding location interest and user travel experience should be tried
Travel Sequence
The method for calculating the sequence score is too naïve
Motivation
Context-aware service
Time + Location
References
Some images in these slides are from
GeoLife: Building social networks using human location history, Microsoft Research
Y. Zheng, Mining Individual Life Pattern Based on Location History: A Paradigm and Framework, slides, 2009
References [5], [7], [14], [18]
GeoLife project papers
Yu Zheng and Xing Xie, Mining Individual Life Pattern Based on Location History, IEEE, 2009
Yu Zheng, Xing Xie, and Wei-Ying Ma, GeoLife2.0: A Location-Based Social Networking Service, IEEE, 2009
Yu Zheng, Lizhu Zhang, Xing Xie, and Wei-Ying Ma, Mining Interesting Locations and Travel Sequences from GPS Trajectories, ACM, 2009
Quannan Li, Yu Zheng, Xing Xie, and Wei-Ying Ma, Mining user similarity based on location history, ACM, 2008
Yu Zheng, Xing Xie, and Wei-Ying Ma, Understanding mobility based on GPS data, ACM, 2008
Yu Zheng and Xing Xie, Learning Transportation Mode from Raw GPS Data for Geographic Application on the Web, ACM, 2008