Similarity Search in High Dimensions via Hashing (Original LSH with Hamming Distance)

Post on 06-Apr-2018

  • 8/3/2019 Similarity Search in High Dimensions via Hashing (Original LSH With Hamming Distance)

    [Figure, two panels: "Point set distance distribution" and "Texture data set point distribution". Each plots normalized frequency (y-axis) against interpoint distance (x-axis).]

    [Figure: error (y-axis) against number of indices (x-axis); alpha=2, n=19000, d=64, k=700.]

    [Figure, four panels, all with number of database points (x 10^4) on the x-axis. Panels 1 and 2: disk accesses for alpha = 2, 1NNS and alpha = 2, 10NNS; curves: SR-tree and LSH with error=.02, .05, .1, .2. Panels 3 and 4: miss ratio for alpha=2, n=19000, d=64, 1NNS and 10NNS; curves: Error=.05 and Error=.1.]

    [Figure, three panels. Panels 1 and 2: disk accesses (y-axis) against dimensions (x-axis) for alpha = 2, 1NNS and alpha = 2, 10NNS; curves: SR-tree and LSH with error=.02, .05, .1, .2. Panel 3: disk accesses (y-axis) against number of nearest neighbors (x-axis) for alpha=2, n=19000, d=64; curves: Error=.05, .1, .2.]

    [Figure, two panels. Panel 1, "Performance vs error": disk accesses (y-axis) against error in percent (x-axis); curves: SR-tree and LSH. Panel 2: number of disk accesses (y-axis) against data set size (x-axis); curves: SR-tree and LSH.]
