spring 2016 lecture 4: segmentationkbuchin/teaching/2ima20/... · 2. prove that optimal solutions...
TRANSCRIPT
2IMA20
Algorithms for Geographic Data
Spring 2016
Lecture 4: Segmentation
Motivation: Geese Migration
Two behavioural types
stopover
migration flight
Input:
GPS tracks
expert description of behaviour
Goal: Delineate stopover sites of migratory geese
Abstract / general purpose questions
Single trajectory
simplification, cleaning
segmentation into semantically meaningful parts
finding recurring patterns (repeated subtrajectories)
Two trajectories
similarity computation
subtrajectory similarity
Multiple trajectories
clustering, outliers
flocking/grouping pattern detection
finding a typical trajectory or computing a mean/median
trajectory
visualization
Problem
For analysis, it is often necessary to break a trajectory into pieces
according to the behaviour of the entity (e.g., walking, flying, …).
Input: A trajectory T, where each point has a set of attribute values,
and a set of criteria.
Attributes: speed, heading, curvature…
Criteria: bounded variance in speed, curvature, direction,
distance…
Aim: Partition T into a minimum number of subtrajectories (so-called
segments) such that each segment fulfils the criteria.
“Within each segment the points have similar attribute values”
Criteria-Based Segmentation
Goal: Partition trajectory into a small number of segments
such that a given criterion is fulfilled on each segment
Example
Trajectory T sampled with equal time intervals
Criterion: speed cannot differ more than a factor 2?
Example
Trajectory T sampled with equal time intervals
Criterion: speed cannot differ more than a factor 2?
Criterion: direction of motion differs by at most 90⁰?
speed
Example
Trajectory T sampled with equal time intervals
Criterion: speed cannot differ more than a factor 2?
Criterion: direction of motion differs by at most 90⁰?
speedspeed
heading
Example
Trajectory T sampled with equal time intervals
Criterion: speed cannot differ more than a factor 2?
Criterion: direction of motion differs by at most 90⁰?
speedspeed
headingspeed &
heading
Decreasing monotone criteria
Definition: A criterion is decreasing monotone, if it holds on a
segment, it holds on any subsegment.
Examples: disk criterion (location), angular range (heading),
speed…
Theorem: A combination of conjunctions and disjunctions of
decreasing monotone criteria is a decreasing monotone criterion.
Greedy Algorithm
Observation:
If criteria are decreasing monotone, a greedy strategy works.
TU/e Prerequisites
Designing greedy algorithms
1. try to discover structure of optimal solutions: what properties do optimal solutions have ?
what are the choices that need to be made ?
do we have optimal substructure ?
optimal solution = first choice + optimal solution for subproblem
do we have greedy-choice property for the first choice ?
2. prove that optimal solutions indeed have these properties
prove optimal substructure and greedy-choice property
3. use these properties to design an algorithm and prove correctness
proof by induction (possible because optimal substructure)
TU/e Prerequisites
Lemma: Let k be the smallest index such that 𝜏 𝑡0, 𝑡𝑘 fails for a monotone
decreasing criterion 𝜏. If there is a solution, then there is an optimal solution
including 𝜏 𝑡0, 𝑡𝑘−1 as segment.
Proof. Let OPT be an optimal solution for A. If OPT includes ai then the
lemma obviously holds, so assume OPT does not include ai.
We will show how to modify OPT into a solution OPT* such that
(i) OPT* is a valid solution
(ii) OPT* includes ai
(iii) size(OPT*) ≤ size(OPT)
Thus OPT* is an optimal solution including 𝜏 𝑡0, 𝑡𝑘−1 , and so the lemma
holds. To modify OPT we proceed as follows.
standard text you can
basically use in proof for any
greedy-choice property quality OPT* ≥ quality OPT
here comes the modification, which is problem-specific
Greedy Algorithm
Observation:
If criteria are decreasing monotone, a greedy strategy works.
For many decreasing monotone criteria Greedy requires O(n) time,
e.g. for speed, heading…
Greedy Algorithm
Observation:
For some criteria, iterative double & search is faster.
Double & search: An exponential search followed by a binary search.
Criteria-Based Segmentation
Boolean or linear combination of decreasing monotone criteria
Greedy Algorithm
incremental in O(n) time or constant-update criteria e.g. bounds
on speed or heading
double & search in O(n log n) time for non-constant update
criteria
[M.Buchin et al.’11]
Motivation: Geese Migration
Two behavioural types
stopover
migration flight
Input:
GPS tracks
expert description of behaviour
Goal: Delineate stopover sites of migratory geese
Case Study: Geese Migration
Data
Spring migration tracks
White-fronted geese
4-5 positions per day
March – June
Up to 10 stopovers during spring migration
Stopover: 48 h within radius 30 km
Flight: change in heading <120°
Kees
Adri
Comparison
manual
computed
Evaluation
Few local differences:
Shorter stops, extra cuts in computed segmentation
A combination of decreasing and increasing monotone criteria
stopover migration flight
Criteria
Within
radius 30km
At least
48hAND
Change in
heading <120°OR
Decreasing and Increasing Monotone Criteria
Observation: For a combination of decreasing and increasing
monotone criteria the greedy strategy does not always work.
Example: Min duration 2 AND Max speed range 4
speed1
5
Non-Monotone Segmentation
Many Criteria are not (decreasing) monotone:
Minimum time
Standard deviation
Fixed percentage of outliers
For these Aronov et al. introduced the start-stop diagram
Example: Geese Migration
Start-Stop Diagram
input trajectory compute start-stop diagram compute segmentation
Algorithmic approach
Start-Stop Diagram
Given a trajectory T over time interval
I = {t0,…,t} and criterion C
The start-stop diagram D is (the upper
diagonal half of) the n x n grid,
where each point (i,j) is associated to
segment [ti,tj] with
• (i,j) is in free space if C holds on [ti,tj]
• (i,j) is in forbidden space if C does not
hold on [ti,tj]
A (minimal) segmentation of T
corresponds to a (min-link) staircase in D
free space
forbidden space
staircase
Start-Stop Diagram
A (minimal) segmentation of T corresponds
to a (min-link) staircase in D.
1234567891011121 2 3 4 5 6 7 8 9 10 11 12
12
11
10
9
8
7
6
5
4
3
2
1
1
3
2
4
9
7
5
8
10
6
12
11
Start-Stop Diagram
A (minimal) segmentation of T corresponds
to a (min-link) staircase in D.
1234567891011121 2 3 4 5 6 7 8 9 10 11 12
12
11
10
9
8
7
6
5
4
3
2
1
1
3
2
4
9
7
5
8
10
6
12
11
Start-Stop Diagram
Discrete case:
A non-monotone segmentation can
be computed in O(n2) time.
1
5
4
6
8
7
9 10
11
12
1234567891011121 2 3 4 5 6 7 8 9 10 11 12
12
11
10
9
8
7
6
5
4
3
2
1
Start-Stop Diagram
Discrete case:
A non-monotone segmentation can
be computed in O(n2) time.
1
5
4
6
8
7
9 10
11
12
1234567891011121 2 3 4 5 6 7 8 9 10 11 12
12
11
10
9
8
7
6
5
4
3
2
1
Stable Criteria
Definition:
A criterion is stable if and only if 𝑖=𝑜𝑛 𝑣 𝑖 = 𝑂(𝑛) where
𝑣 𝑖 = number of changes of validity on segments [0,i], [1,i],…, [i-1,i]
Stable Criteria
Definition:
A criterion is stable if and only if 𝑖=𝑜𝑛 𝑣 𝑖 = 𝑂(𝑛) where
𝑣 𝑖 = number of changes of validity on segments [0,i], [1,i],…, [i-1,i]
Observations:
Decreasing and increasing monotone criteria are stable.
A conjunction or disjunction of stable criterion are stable.
Compressed Start-Stop Diagram
For stable criteria the start-stop diagram can be compressed by applying
run-length encoding.
Examples:
increasing monotonedecreasing monotone
9
Computing the Compressed Start-Stop Diagram
For a decreasing criterion consider the algorithm:
ComputeLongestValid(crit C, traj T)
Algorithm:
Move two pointers i,j from n to 0 over the trajectory. For every
trajectory index j the smallest index i for which 𝜏 𝑡𝑖 , 𝑡𝑗 satisfies
the criterion C is stored.
i j
9
Computing the Compressed Start-Stop Diagram
For a decreasing criterion ComputeLongestValid(crit C, traj T):
Move two pointers i,j from n to 0
over the trajectory
i j
Requires a data structure for segment [ti,tj] allowing the operations
isValid, extend, and shorten, e.g., a balanced binary search tree on
attribute values for range or bound criteria.
Runs in O(nc(n)) time where c(n) is the time to update & query.
Analogously for increasing criteria.
9
Computing the Compressed Start-Stop Diagram
The start-stop diagram of a conjunction (or disjunction) of two stable
criteria is their intersection (or union).
The start-stop diagram of a
negated criteria is its inverse.
The corresponding compressed
start-stop diagrams can be
computed in O(n) time.
9
Attributes and criteria
Examples of stable criteria
Lower bound/Upper bound on attribute
Angular range criterion
Disk criterion
Allow an approximate fraction of outliers
…
Computing the Optimal Segmentation
Observation: The optimal segmentation for [0,i] is either one segment,
or an optimal sequence of segments for [0,j<i] appended with a
segment [j,i], where j is an index such [j,i] is valid.
Dynamic programming algorithm
for each row from 0 to n
find white cell with min-link
That is, iteratively compute a table S[0,n]
where entry S[i] for row i stores
last: index of last link
count: number of links so far
runs in O(n2) time
S
nil, 0nil, ∞
nil, ∞nil, ∞
nil, ∞nil, ∞
0, 0+1=10, 1
7, 28, 3
8, 3
10, 4
Computing the Optimal Segmentation
T
[Alewijnse et al.’14]
More efficient dynamic programming algorithm for compressed
diagrams
Process blocks of white cells using a range query in a binary search
tree T (instead of table S) storing
index: row index
last: index of last link
count: number of links so far
augmented by minimal count
in subtree
TU/e Prerequisites
Methodology for augmenting a data structure
1. Choose an underlying data structure.
2. Determine additional information to maintain.
3. Verify that we can maintain additional information
for existing data structure operations.
4. Develop new operations.
1. R-B tree
2. min-count[x]
3. theorem next
slide
4. count(block)
using range
query
TU/e Prerequisites
Theorem
Augment a R-B tree with field f, where f[x] depends only on
information in x, left[x], and right[x] (including f[left[x]] and
f[right[x]]). Then can maintain values of f in all nodes during
insert and delete without affecting O(log n) performance.
When we alter information in x, changes propagate only upward
on the search path for x …
Computing the Optimal Segmentation
T
[Alewijnse et al.’14]
More efficient dynamic programming algorithm for compressed
diagrams
Process blocks of white cells using a range query in a binary search
tree T (instead of table S) storing
index: row index
last: index of last link
count: number of links so far
augmented by minimal count
in subtree
runs in O(n log n) time
Beyond criteria-based segmentation
When is criteria-based segmentation applicable?
What can we do in other cases?
Excursion: Movement Models
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Space-time Prisms
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Space-time Prisms
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Space-time Prisms
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Space-time Prisms
Random Motion Models
(e.g. Brownian bridges)
Movement Models
Movement data: sequence
of observations, e.g. (xi,yi,ti)
Linear movement realistic?
Space-time Prisms
Random Motion Models
(e.g. Brownian bridges)
Brownian Bridges - Examples
Brownian bridges no bridges
Segment by diffusion coefficient
Summary
Greedy algorithm for decreasing monotone criteria O(n) or O(n log n)
time
[M.Buchin, Driemel, van Kreveld, Sacristan, 2010]
Case Study: Geese Migration
[M.Buchin, Kruckenberg, Kölzsch, 2012]
Start-stop diagram for arbitrary criteria O(n2) time
[Aronov, Driemel, van Kreveld, Löffler, Staals, 2012]
Compressed start-stop diagram for stable criteria O(n log n) time
[Alewijnse, Buchin, Buchin, Sijben, Westenberg, 2014]
5
References
S. Alewijnse, T. Bagautdinov, M. de Berg, Q. Bouts, A. ten Brink, K.
Buchin and M. Westenberg. Progressive Geometric Algorithms.
SoCG, 2014.
M. Buchin, A. Driemel, M. J. van Kreveld and V. Sacristan.
Segmenting trajectories: A framework and algorithms using
spatiotemporal criteria. Journal of Spatial Information Science,
2011.
B. Aronov, A. Driemel, M. J. Kreveld, M. Loffler and F. Staals.
Segmentation of Trajectories for Non-Monotone Criteria. SODA,
2013.
M. Buchin, H. Kruckenberg and A. Kölzsch. Segmenting
Trajectories based on Movement States. SDH, 2012.
5
References