approximate nearest neighbors: towards removing the curse of dimensionality

13
Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality Piotr Indyk, Rajeev Motwa ni The 30 th annual ACM symposium on theory of computing 1998

Upload: hoyt-mccray

Post on 01-Jan-2016

44 views

Category:

Documents


5 download

DESCRIPTION

Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality. Piotr Indyk, Rajeev Motwani. The 30 th annual ACM symposium on theory of computing 1998. Problems. Nearest neighbor (NN) problem: - PowerPoint PPT Presentation

TRANSCRIPT

Approximate Nearest Neighbors: Towards Removing the Curse of

Dimensionality

Piotr Indyk, Rajeev Motwani

The 30th annual ACM symposium on theory of computing 1998

Problems

• Nearest neighbor (NN) problem:– Given a set of n points P={p1, …, pn} in some metric sp

ace X, preprocess P so as to efficiently answer queries which require finding the point in P closest to a query point qX.

• Approximate nearest neighbor (ANN) problem:– Find a point pP that is an –approximate nearest nei

ghbor of the query q in that for all p'P, d(p,q)(1+)d(p',q).

Motivation

• The nearest neighbors problem is of major importance to a variety of applications, usually involving similarity searching. – Data compression– Databases and data mining– Information retrieval– Image and video databases– Machine learning – Pattern recognition – Statistics and data analysis

• Curse of dimensionality– The curse of dimensionality is a term coined by Richard

Bellman to describe the problem caused by the exponential increase in volume associated with adding extra dimensions to a (mathematical) space.

Overview of results and techniques

• These results are obtained by reducing -NNS to a new problem: point location in equal balls.

nearest neighbor search (NNS)-nearest neighbor search (NNS)

Ring-Cover Trees

Point location in equal balls (PLEB)- Point location in equal balls (PLEB)

Locality-Sensitive Hashing

Proposition 1 Proposition 2

The Bucketing method

Proposition 3

Random projections

Content

Definitions

Theorems

Constructing Ring-cover trees

Analysis of Ring-cover trees

Definitions

Locality-Sensitive Hashing

The Bucketing method

• We decompose each ball into a bounded number of cells and store them in a dictionary.

• The bucketing algorithm works for any lp norm.

J. L. Lemma