


Complexity of

iff.

An (curved) edge:

Vertices:

Only

𝑞

Nearest-Neighbor Searching Under Uncertainty II

Pankaj K. Agarwal, Boris Aronov, Sariel Har-Peled, Jeff M. Phillips, Ke Yi, and Wuzhou Zhang

Model and Qualification Probability

Motivation

Prior Work

Computing qualification probabilities

Future Work

ACM SIGMOD–SIGACT–SIGART Symposium on PRINCIPLES OF DATABASE SYSTEMS (PODS 2013)

The PNN problem under the existential model
• The non-zero NN definition does not make sense
• Solutions here cannot be directly adapted

Nonzero NNs

Uncertain point : represented as a probability density function (pdf)
• in : the pdf of
• 𝑞: any given query point
• : the pdf of
• : the cdf of

The qualification probability
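As an illustration of the qualification probability, the following is a minimal Python sketch for the discrete case. It assumes that each uncertain point is given as a finite list of weighted locations, that the uncertain points are independent, and that no two locations are at exactly the same distance from the query. The function name and data layout are hypothetical and not taken from the poster. The qualification probability of a point is computed as the sum, over its locations, of the location's weight times the probability that every other point lies farther from the query.

```python
import math

def qualification_probabilities(points, q):
    """Exact qualification probabilities for discrete uncertain points.

    points: list of uncertain points; each is a list of ((x, y), weight)
            pairs whose weights sum to 1 (assumed data layout).
    q:      query point (x, y).
    """
    # Distance from q to every location, grouped by uncertain point.
    dists = [[(math.dist(loc, q), w) for loc, w in p] for p in points]
    probs = []
    for i, di in enumerate(dists):
        total = 0.0
        for r, w in di:
            # Probability that every other point is farther than r from q.
            farther = 1.0
            for j, dj in enumerate(dists):
                if j != i:
                    farther *= sum(wj for rj, wj in dj if rj > r)
            total += w * farther
        probs.append(total)
    return probs

# Two uncertain points on a line, query at the origin.
example = [
    [((1, 0), 0.5), ((3, 0), 0.5)],
    [((2, 0), 1.0)],
]
example_probs = qualification_probabilities(example, (0, 0))
```

In this example each point is the nearest neighbor with probability one half, and the probabilities sum to one.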

Data location is imprecise…
• Sensor databases
• Face recognition
• Mobile data

What is the “nearest neighbor” of 𝑞 now?

Acknowledgements

P. Agarwal and W. Zhang are supported by NSF under grants CCF-09-40671, CCF-10-12254, and CCF-11-61359, by ARO grants W911NF-07-1-0376 and W911NF-08-1-0452, and by an ERDC contract W9132V-11-C-0003. B. Aronov is supported by NSF grants CCF-08-30691, CCF-11-17336, and CCF-12-18791, and by NSA MSP Grant H98230-10-1-0210. S. Har-Peled is supported by NSF grants CCF-09-15984 and CCF-12-17462.

Nonzero NNs.
• in the case of disks: [Evans et al. 2008]
• Voronoi-based heuristics [Zhang et al. 2013]

Computing qualification probabilities.
• Best-effort based [Kriegel et al. 2007][Cheng et al. 2008]

Other variants.
• Expected Nearest Neighbor [Agarwal et al. 2012]
• Superseding Nearest Neighbor [Yuen et al. 2010]
• Top-𝑘 NNs [Ljosa et al. 2007][Beskales et al. 2008]

Complexity of
• if assuming general disks.
• if pairwise disjoint disks of same radii.
• if has locations.

In all the cases, where , and is the output size.

Monte Carlo method

The number of instantiations is . If each has a discrete pdf of size : , with probability at least

Spiral Search method

Only need to look at a small number of closest points!
• Each has 𝑘 equally likely locations.
• Estimate using 𝑚 closest points.
• Independent of 𝑛!
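The two estimation methods named above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the authors' implementation: each uncertain point is a list of 𝑘 equally likely locations, all distances from the query are distinct, and the function names are hypothetical. The brute-force scan that retrieves the 𝑚 closest locations stands in for a nearest-neighbor index, so only the work done after retrieval depends on 𝑚 alone.

```python
import heapq
import math
import random

def monte_carlo_pnn(points, q, s, rng):
    """Estimate qualification probabilities from s random instantiations."""
    counts = [0] * len(points)
    for _ in range(s):
        # Draw one location per uncertain point, then find the nearest one.
        sample = [rng.choice(p) for p in points]
        nearest = min(range(len(points)),
                      key=lambda i: math.dist(sample[i], q))
        counts[nearest] += 1
    return [c / s for c in counts]

def spiral_search_pnn(points, q, m):
    """Estimate qualification probabilities from the m closest locations."""
    w = 1.0 / len(points[0])  # each location has probability 1/k
    all_locs = [(math.dist(loc, q), i)
                for i, p in enumerate(points) for loc in p]
    closest = heapq.nsmallest(m, all_locs)  # sorted by distance from q
    seen = {}  # point index -> number of its locations already visited
    est = {}   # point index -> estimated qualification probability
    for _r, i in closest:
        # Probability that every other visited point is farther away.
        farther = 1.0
        for j, count in seen.items():
            if j != i:
                farther *= 1.0 - count * w
        est[i] = est.get(i, 0.0) + w * farther
        seen[i] = seen.get(i, 0) + 1
    return est

pts = [[(1, 0), (3, 0)], [(2, 0), (4, 0)]]
mc_est = monte_carlo_pnn(pts, (0, 0), 20000, random.Random(0))
spiral_full = spiral_search_pnn(pts, (0, 0), 4)
spiral_partial = spiral_search_pnn(pts, (0, 0), 2)
```

With all four locations visited, the spiral estimate equals the exact values (0.75 and 0.25 in this example). With fewer locations it can only underestimate, since unvisited locations contribute nothing to the sum.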

Indexing schemes (using less space)
• If each uncertainty region is a disk,
• If each has possible locations,

Two sub-problems:
• Nonzero NNs.
• Computing qualification probabilities.

Nonzero Voronoi Diagram

for any 𝒯⊆𝒫,
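For uncertainty regions that are disks, the nonzero nearest neighbors of a query can be found with a simple distance test, sketched below in Python. The sketch assumes that each pdf is positive throughout its disk. Under that assumption, a point can be the nearest neighbor exactly when its smallest possible distance to the query is less than the largest possible distance of every other point. The function name and the (cx, cy, radius) layout are hypothetical, and the linear scan ignores the indexing structures discussed on the poster.

```python
import math

def nonzero_nns(disks, q):
    """Indices of disks that can be the nearest neighbor of q.

    disks: list of (cx, cy, radius) uncertainty regions (assumed layout).
    q:     query point (x, y).
    """
    near = []  # smallest possible distance from q to each disk
    far = []   # largest possible distance from q to each disk
    for cx, cy, r in disks:
        d = math.hypot(cx - q[0], cy - q[1])
        near.append(max(0.0, d - r))
        far.append(d + r)
    result = []
    for i in range(len(disks)):
        # Disk i qualifies if it can be closer than every other disk.
        if all(near[i] < far[j] for j in range(len(disks)) if j != i):
            result.append(i)
    return result

disks = [(2, 0, 1), (5, 0, 1), (2.5, 0, 1)]
candidates = nonzero_nns(disks, (0, 0))
```

Here the second disk is excluded because even its closest point (distance 4) is farther than the farthest point of the first disk (distance 3).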

Nearest Neighbor (NN) Searching

Post office problem
• 𝑆: a set of points
• 𝑞: any query point

Find the closest one

Probabilistic Nearest Neighbor (PNN)