RELAXED REVERSE NEAREST NEIGHBORS QUERIES
Arif Hidayat, Muhammad Aamir Cheema, David Taniar
Outline
Motivation
Problem Definition
Technique
Experiment
Conclusion
RkNN Query
R2NN of is
Nearest Neighbor Query (NN)
– Find the object closest to the query q
Reverse Nearest Neighbor Query (RNN)
– Find the objects which consider q as their NN
2 Nearest Neighbors (2NN) of are and
However, and are not Reverse 2 Nearest Neighbors (R2NN) of
Motivation
In R2NN query, is influenced by and
However, it is believed that is also influenced by
Normally, a user will not mind travelling slightly farther to the next closest facility
In this case, RNN may miss influenced objects or retrieve non-influenced ones
[Figure: users u1, u2, u3 and facilities f1, f2, f3, f4; distances of 30 km and 31 km are marked]
Contribution
Complement RNN query with relative distance
New pruning techniques
Extensive experimental study
RRNN-Problem Definition
Given a set of users U, a set of facilities F, a query facility q and a value of x > 1,
an RRNN query returns every user u for which dist(u, q) ≤ x · NNdist(u, F), where NNdist(u, F) denotes the distance between u and its nearest facility in F
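The definition can be checked directly with a brute-force scan. The sketch below is only an illustration of the definition, not the algorithm from the talk; the function name `rrnn_brute_force` and the 2D tuple representation of points are assumptions made for this example.

```python
from math import dist


def rrnn_brute_force(users, facilities, q, x):
    """Return every user u with dist(u, q) <= x * NNdist(u, F).

    Hypothetical helper: scans all facilities for every user, so it
    costs O(|U| * |F|) and only serves to illustrate the definition.
    """
    result = []
    for u in users:
        # Distance from u to its nearest facility in F.
        nn_dist = min(dist(u, f) for f in facilities)
        if dist(u, q) <= x * nn_dist:
            result.append(u)
    return result


# One facility 1 unit from the first user, query 1.5 units away.
users = [(0.0, 0.0), (2.0, 0.0)]
facilities = [(1.0, 0.0)]
q = (-1.5, 0.0)

print(rrnn_brute_force(users, facilities, q, x=1.5))
print(rrnn_brute_force(users, facilities, q, x=1.2))
```

With x = 1.5 the first user is returned because 1.5 ≤ 1.5 · 1; with x = 1.2 no user qualifies.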
[Figure: users u1, u2, u3, facilities f1, f2, f4 and query q; distances of 1 km and 1.5 km are marked, with x = 1.5]
RRNN-Pruning
Compute regions in which users cannot be RRNN of q
[Figure: two diagrams of query q with points a to g and users u1, u2; the first shows partitions P1 to P6 of 60° each]
New pruning rule
Six-regions and half-space pruning are not applicable to the RRNN problem
Point-based Pruning
Given a query q, a value of x > 1 and a point p, the pruning circle Cp of p is a circle centered at c with radius r, where
r = x · dist(q, p) / (x² − 1)
c is on the line passing through q and p
dist(q, c) = x² · dist(q, p) / (x² − 1)
Proof:
[Figure: q, p, centre c, pruning circle Cp, a user u and angle θ]
Point-based Pruning
[Figure: pruning circle Cp with centre c and radius r, showing q, p, b, users u and u′, and angle θ]
The pruning rule is tight (proof is in the paper)
Given a query point , a user (outside ) and its nearest facility
–
– cannot be pruned by
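The pruning circle can be computed from q, p and x alone. The sketch below assumes 2D points as tuples; `pruning_circle` and `in_circle` are hypothetical names. Algebraically, the circle with this centre and radius is the set of points u where dist(u, q) = x · dist(u, p), so a user strictly inside it has dist(u, q) > x · dist(u, p) and cannot be an RRNN of q when p is a facility.

```python
from math import dist


def pruning_circle(q, p, x):
    """Return (centre, radius) of the pruning circle of p for x > 1."""
    if x <= 1:
        raise ValueError("x must be greater than 1")
    d = dist(q, p)
    # dist(q, c) = x^2 * dist(q, p) / (x^2 - 1), measured from q towards p.
    scale = x * x / (x * x - 1)
    centre = (q[0] + scale * (p[0] - q[0]), q[1] + scale * (p[1] - q[1]))
    # r = x * dist(q, p) / (x^2 - 1)
    radius = x * d / (x * x - 1)
    return centre, radius


def in_circle(u, circle):
    """True if u lies strictly inside the circle (centre, radius)."""
    centre, radius = circle
    return dist(u, centre) < radius


q, p, x = (0.0, 0.0), (3.0, 0.0), 2.0
circle = pruning_circle(q, p, x)
print(circle)

for u in [(3.0, 0.0), (5.0, 1.0), (1.0, 0.0), (4.0, 3.0)]:
    # Both tests should agree for points away from the boundary.
    print(u, in_circle(u, circle), dist(u, q) > x * dist(u, p))
```

For q = (0, 0), p = (3, 0) and x = 2 the centre lies on the line through q and p, beyond p, at distance 4 from q, and the radius is 2.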
MBR-based Pruning
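Given a query q, a value of x > 1, and a line ab representing a side of an MBR,
a user u cannot be the RRNN of q if it lies inside both of the pruning circles Ca and Cb,
i.e., u can be pruned if u lies in Ca ∩ Cb
[Figure: side ab of an MBR with pruning circles Ca and Cb, points a' and b', and a user u]
[Figure: MBR with corners a, b, c, d and pruning circles Ca, Cb, Cc, Cd]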
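A minimal sketch of the side-based test, assuming 2D tuples and hypothetical function names. Instead of building the circles, it tests membership in a pruning circle through the equivalent inequality dist(u, q) > x · dist(u, p). The reasoning it relies on: any facility f on the segment ab satisfies dist(u, f) ≤ max(dist(u, a), dist(u, b)), so a user inside both circles is also pruned by f.

```python
from math import dist


def inside_pruning_circle(u, q, p, x):
    """True if u lies strictly inside the pruning circle of p.

    Uses the inequality dist(u, q) > x * dist(u, p), which holds
    exactly for the points inside that circle.
    """
    return dist(u, q) > x * dist(u, p)


def pruned_by_side(u, q, a, b, x):
    """True if u lies inside both pruning circles of endpoints a and b."""
    return inside_pruning_circle(u, q, a, x) and inside_pruning_circle(u, q, b, x)


q, x = (0.0, 0.0), 1.5
a, b = (4.0, -1.0), (4.0, 1.0)   # one side of a hypothetical MBR

print(pruned_by_side((4.0, 0.0), q, a, b, x))   # between a and b
print(pruned_by_side((4.0, 3.0), q, a, b, x))   # inside the circle of b only
print(pruned_by_side((1.0, 0.0), q, a, b, x))   # close to q

# A facility anywhere on the side also prunes a user that the side prunes.
f = (4.0, 0.5)
print(inside_pruning_circle((4.0, 0.0), q, f, x))
```

The second user shows why both circles are needed: it is inside the circle of b but not of a, so the side alone does not prune it.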
Filtering
Prune users using the defined pruning regions
Straightforward approach:
– Store pruning regions in a list
– Check user against entries in the list: O(n)
Our approach:
– Define an interval for each pruning region
– Build an interval tree for each partition
– Check users against overlapping intervals: O(log n + k)
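The O(log n + k) lookup can be illustrated with a centred interval tree answering stabbing queries. This is a generic sketch, not the structure from the talk: the class name, the (lo, hi, label) triples and the sample values are assumptions, and how an interval is derived from a pruning region is not reproduced here.

```python
class IntervalTree:
    """Centred interval tree over closed intervals (lo, hi, label)."""

    def __init__(self, intervals):
        self.centre = None
        self.by_lo, self.by_hi = [], []
        self.left = self.right = None
        if not intervals:
            return
        # The median endpoint keeps both subtrees at most half the size.
        endpoints = sorted(e for lo, hi, _ in intervals for e in (lo, hi))
        self.centre = endpoints[len(endpoints) // 2]
        here = [iv for iv in intervals if iv[0] <= self.centre <= iv[1]]
        left = [iv for iv in intervals if iv[1] < self.centre]
        right = [iv for iv in intervals if iv[0] > self.centre]
        self.by_lo = sorted(here, key=lambda iv: iv[0])
        self.by_hi = sorted(here, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalTree(left) if left else None
        self.right = IntervalTree(right) if right else None

    def stab(self, point):
        """Return all stored intervals that contain point."""
        out = []
        node = self
        while node is not None and node.centre is not None:
            if point < node.centre:
                # Intervals here end at or after the centre; check starts only.
                for iv in node.by_lo:
                    if iv[0] > point:
                        break
                    out.append(iv)
                node = node.left
            elif point > node.centre:
                # Intervals here start at or before the centre; check ends only.
                for iv in node.by_hi:
                    if iv[1] < point:
                        break
                    out.append(iv)
                node = node.right
            else:
                out.extend(node.by_lo)
                break
        return out


# Hypothetical intervals, one per pruning region of a partition.
tree = IntervalTree([(0, 10, "A"), (5, 15, "B"), (20, 30, "C"), (12, 13, "D")])
print(sorted(label for _, _, label in tree.stab(12)))
print(sorted(label for _, _, label in tree.stab(17)))
```

Only the pruning regions whose intervals are returned would then be checked against the user, instead of scanning the whole list.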
RRNN Candidates
Verification
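Verify candidates:
– Circular boolean range query on facility R*-Tree
– A user candidate u is an RRNN of q if no facility f satisfies x · dist(u, f) < dist(u, q)
More techniques:
– Computing interval
– Trimming
[Figures: rectangle R trimmed against pruning circles Ca and Cb (labels Rb, Ra, Rt); partitions P1 to P6 around q with interval Ai between Ai.min and Ai.max, points a and b, and entries e1, e2, e3]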
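The verification step can be sketched as a boolean range query that stops at the first facility found. The version below scans a plain list instead of an R*-Tree, and the function names are hypothetical; the circle is centred at the candidate u with radius dist(u, q) / x, which follows from the RRNN definition.

```python
from math import dist


def boolean_range_query(facilities, centre, radius):
    """True as soon as one facility lies strictly inside the circle."""
    return any(dist(f, centre) < radius for f in facilities)


def verify_candidate(u, q, facilities, x):
    """A candidate u is an RRNN of q if the circle around u is empty."""
    return not boolean_range_query(facilities, u, dist(u, q) / x)


u, q, x = (0.0, 0.0), (3.0, 0.0), 1.5   # circle radius is 3 / 1.5 = 2

print(verify_candidate(u, q, [(0.0, 2.5)], x))              # facility outside
print(verify_candidate(u, q, [(0.0, 2.5), (1.0, 0.0)], x))  # facility inside
```

A facility exactly on the circle does not disqualify the candidate, which matches the non-strict inequality in the definition.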
Experiment Design
Implemented in C++
Run on an Intel Core i5 2.3 GHz ×4 PC with 8 GB memory running Debian Linux
Users and facilities are indexed with R*-Tree
Each experiment runs 100 queries
Parameter        Values
Data size        2K, 200K, 2M, 20M
x factor         1.1, 1.3, 1.5, 2, 4
Real data set    NA, LA, CA
Experiment Design
Synthetic and real data sets
175,812 points from North America (NA), 2.6 million points from Los Angeles (LA) and 25.6 million points from California (CA)
Each data set is divided into users and facilities of almost equal size
Improved range query
– For user and facility R*-Tree entry, , is immediately pruned if
– is not opened if
Experiment Results
No previous method for RRNN problem
We compare with the naïve range query and improved range query algorithms
Experiment Results
Our algorithm is several orders of magnitude better than the improved algorithm
Conclusion
An RRNN query relaxes the definition of influence using the relative distances between the users and the facilities
Our algorithm, based on the proposed effective pruning techniques, is several orders of magnitude better than the competitors
Future work:
— Continuous RRNN
— Relaxed Reverse Top-