Image Retrieval with Geometry-Preserving Visual Phrases


Yimeng Zhang, Zhaoyin Jia and Tsuhan Chen, Cornell University


Similar Image Retrieval

[Figure: a query image is compared against an image database to produce ranked relevant images.]

Bag-of-Visual-Words (BoW)

Images are represented as a histogram of visual words; the histogram length is the dictionary size.

Similarity of two images: cosine similarity of the histograms.
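The BoW representation above can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation; the function names and the assumption that visual words are integer ids in [0, dictionary_size) are illustrative.

```python
import math


def bow_histogram(words, dictionary_size):
    """Represent an image as a histogram over a visual-word dictionary.

    `words` is the list of quantized visual-word ids detected in the image.
    """
    hist = [0] * dictionary_size
    for w in words:
        hist[w] += 1
    return hist


def cosine_similarity(h1, h2):
    """Cosine similarity of two word histograms."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    return dot / (n1 * n2) if n1 and n2 else 0.0
```

Two images with identical histograms score 1.0; images sharing no words score 0.0.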

Geometry-Preserving Visual Phrases

A length-k phrase: k visual words in a certain spatial layout.

Bag of Phrases: represent an image as a histogram over all (length-2) phrases.

Phrases vs. Words

[Figure: feature matches on an irrelevant and a relevant image pair under word, length-2, and length-3 phrase matching.]

Previous Works

Geometry Verification

Searching step with BoW, then post-processing (geometry verification) applied only to the top-ranked images.

Encode Spatial Info

Modeling relationships between words:

Co-occurrences in the entire image [L. Torresani et al., CVPR 2009]: no spatial information.

Phrases in local neighborhoods [J. Yuan et al., CVPR07][Z. Wu et al., CVPR10][C.L. Zitnick, Tech. Report 07]: no long-range interactions, weak geometry.

Select a subset of phrases [J. Yuan et al., CVPR07]: discards a large portion of phrases.

The bag-of-phrases vector (shown for length-2 phrases) has dimension exponential in the number of words per phrase.

Previous works: reduce the number of phrases.

Our work: all phrases, with linear computation time.

Approach

Overview

Two steps, each extended from BoW to BoP:

1. Similarity measure: BoP similarity from [Zhang and Chen, 09].

2. Large-scale retrieval: inverted files and min-hash with BoP (this paper).

Co-occurring Phrases

[Figure: two images with local features A-F; word groups that appear in both images in the same relative layout form co-occurring phrases.]

[Zhang and Chen, 09]: only consider the translation difference.

Co-occurring Phrase Algorithm

Each matched word pair, located at (x, y) in image I and (x′, y′) in image I′, votes its offset (x − x′, y − y′) into a quantized offset space.

Words that fall into the same offset cell agree on a common translation, and a cell containing m matched words contributes C(m, k) co-occurring length-k phrases [Zhang and Chen, 09].

Example: offset cells {A, B, C}, {D, F}, and {E, F} give C(3, 2) + 1 + 1 = 5 co-occurring length-2 phrases.
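The offset-space counting above can be sketched as follows. This is a minimal sketch assuming translation-only geometry and a simple rounding quantizer; the function name, the `matches` format, and the `cell` parameter are illustrative, not from the paper.

```python
from collections import defaultdict
from math import comb


def count_cooccurring_phrases(matches, k=2, cell=1.0):
    """Count co-occurring length-k phrases between two images in O(M) time.

    `matches` is a list of ((x, y), (x2, y2)) pairs: the locations of the
    same visual word in image I and image I'. Each pair votes its offset
    into a quantized offset space; a cell holding m votes contributes
    C(m, k) co-occurring length-k phrases.
    """
    offset_space = defaultdict(int)
    for (x, y), (x2, y2) in matches:
        dx = round((x - x2) / cell)   # quantized translation in x
        dy = round((y - y2) / cell)   # quantized translation in y
        offset_space[(dx, dy)] += 1
    return sum(comb(m, k) for m in offset_space.values())
```

With three matches at offset (1, 0) and two matches each at two other offsets, the count is C(3, 2) + 1 + 1 = 5, reproducing the slide's example.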

Relation with the Feature Vector

The number of co-occurring length-k phrases equals the inner product ⟨Φk(X), Φk(Y)⟩ of the two bag-of-phrases feature vectors Φk(X) and Φk(Y), even though those vectors are too large to build explicitly.

With the offset space, this inner product is computed in O(M) time, where M is the number of corresponding word pairs (in practice linear in the number of local features): the same as BoW!

Inverted Index with BoW

Avoid comparing with every image: each visual word's inverted list stores the IDs of the images containing it, and traversing the lists of the query's words adds +1 per hit to a score table over image IDs (I1, I2, …, In).

Inverted Index with Word Location

Additionally store the word's location with each entry. Assuming the same word occurs only once in the same image, the memory usage is the same as BoW.
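The BoW inverted-index scoring described above can be sketched like this. It is a minimal sketch under the slide's assumption that each word occurs at most once per image; the names and the dict-based database format are illustrative.

```python
from collections import defaultdict


def build_inverted_index(database):
    """database: {image_id: list of visual-word ids}.

    Returns {word: [image ids containing that word]}.
    """
    index = defaultdict(list)
    for image_id, words in database.items():
        for w in set(words):          # assume each word occurs once per image
            index[w].append(image_id)
    return index


def score_query(query_words, index):
    """Traverse only the lists of the query's words; +1 per shared word."""
    scores = defaultdict(int)        # the score table over image IDs
    for w in set(query_words):
        for image_id in index.get(w, []):
            scores[image_id] += 1
    return dict(scores)
```

Only images sharing at least one word with the query are ever touched, which is the point of the inverted file.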

Score Table

Compute the number of co-occurring phrases. BoW keeps one score per image ID (I1, I2, …, In); BoP additionally computes an offset space for each candidate image.

Inverted Files with Phrases

For each query word wi, traverse its inverted list; for every image in the list, vote +1 into that image's offset space, in the cell given by the offset between the query location and the stored location.

[Figure: inverted index lists (I1, I10, …; I8, …; I5, …) feeding votes into per-image quantized offset spaces with cells (0,0), (1,0), (0,1), ….]

Final Score

[Figure: each image's offset space is collapsed into its final similarity score in the score table over I1, I2, …, In.]
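Putting the last three slides together, a BoP inverted file with per-image offset spaces can be sketched as below. This is a minimal sketch, not the paper's implementation: it assumes each word occurs once per image, translation-only offsets, integer quantization, and illustrative names throughout.

```python
from collections import defaultdict
from math import comb


def build_index(database):
    """database: {image_id: {word: (x, y)}}.

    Each entry stores the word's location alongside the image ID, so memory
    matches the BoW inverted index plus one location per posting.
    """
    index = defaultdict(list)
    for image_id, words in database.items():
        for w, loc in words.items():
            index[w].append((image_id, loc))
    return index


def bop_scores(query_words, index, k=2):
    """query_words: {word: (x, y)}.

    One offset space per candidate image; each shared word votes +1 at its
    quantized offset. The final score of an image is the sum over cells of
    C(m, k), the number of co-occurring length-k phrases.
    """
    offset_spaces = defaultdict(lambda: defaultdict(int))
    for w, (qx, qy) in query_words.items():
        for image_id, (px, py) in index.get(w, []):
            offset = (round(qx - px), round(qy - py))
            offset_spaces[image_id][offset] += 1
    return {img: sum(comb(m, k) for m in space.values())
            for img, space in offset_spaces.items()}
```

An image whose shared words all agree on one translation scores C(m, 2); scattered matches land in singleton cells and contribute nothing, which is how geometry is enforced during search rather than in post-processing.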

Overview

Large-scale retrieval, from BoW to BoP: after inverted files, min-hash, which has less storage and time complexity.

Min-hash with BoW

The probability of a min-hash collision (the same word is the minimum for images I and I′) equals the image similarity (set overlap of their word sets).
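The collision property above can be checked empirically. A minimal sketch, not the paper's scheme: the vocabulary size, number of hash functions, and the two synthetic word sets are illustrative, and "similarity" here means the Jaccard overlap of the two sets.

```python
import random


def min_hash(word_set, hash_fns):
    """Sketch of an image: the minimum hashed word id under each function."""
    return [min(h[w] for w in word_set) for h in hash_fns]


# Build random permutations of the vocabulary as hash functions.
vocab = range(1000)
random.seed(0)
hash_fns = []
for _ in range(200):
    perm = list(vocab)
    random.shuffle(perm)
    hash_fns.append({w: perm[w] for w in vocab})

# Two hypothetical word sets with Jaccard overlap 30 / 90 = 1/3.
I1 = set(range(0, 60))
I2 = set(range(30, 90))

s1, s2 = min_hash(I1, hash_fns), min_hash(I2, hash_fns)
collision_rate = sum(a == b for a, b in zip(s1, s2)) / len(hash_fns)
```

The measured collision rate concentrates around the true overlap of 1/3 as the number of hash functions grows.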

Min-hash with Phrases

Use the probability of k min-hash collisions with consistent geometry: the offsets (x − x′, y − y′) of the collided words between I and I′ must fall into the same cell of the offset space (details are in the paper).

Other Invariances

Scale: estimate the relative scale ŝ = s/s′ of each matched pair (p1, p2, p3) between image I and image I′, compute scale-normalized offsets (x − ŝx′, y − ŝy′), and add a log(ŝ) dimension to the offset space. The extra dimension increases the memory usage.

[Zhang and Chen, 10]: variant matching via local histogram matching.

Evaluation

1. BoW + inverted index vs. BoP + inverted index

2. BoW + min-hash vs. BoP + min-hash

Post-processing methods are complementary to our work.

Experiments: Inverted Index

Oxford 5K dataset (55 queries) with 1M Flickr distractors [Philbin, J. et al. 07].

Example Precision-Recall Curve

[Figure: precision-recall curves for BoW and BoP on two example queries.]

BoP achieves higher precision at lower recall.

Comparison

Mean average precision (mAP): the mean of the AP over the 55 queries.

[Figure: mAP (0.45-0.70) vs. vocabulary size (100K-1000K) for BoW, BoP, BoW+RANSAC, and BoP+RANSAC.]

BoP outperforms BoW at similar computation, and outperforms BoW+RANSAC, which is 10 times slower on the 150 top images. The improvement is larger for smaller vocabulary sizes.

+ Flickr 1M Dataset

Computational complexity:

Method        Memory   Quantization   Search
BoW           8.1G     0.89s          0.137s
BoP           8.5G                    0.215s
BoW+RANSAC    -        0.89s          4.137s

RANSAC: 4s on the top 300 images.

[Figure: mAP (0.4-0.65) vs. number of images for BoW and BoP.]

Experiment: Min-hash

University of Kentucky dataset; min-hash with BoW from [O. Chum et al., BMVC08].

[Figure: retrieval score (2.80-3.30) vs. number of min-hash functions (200, 500, 800) for BoW and BoP.]

Conclusion

Encodes more spatial information into the BoW representation.

Can be applied to all images in the database at the searching step.

Same computational complexity as BoW.

Better retrieval precision than BoW+RANSAC.
