fast and compact retrieval methods in computer vision
DESCRIPTION
Fast and Compact Retrieval Methods in Computer Vision. Rahul Garg Xiao Ling. Objective. phrase. Given a Query image, find all instances of that object in an image database. On the web. Objective. phrase. Given a Query image, find all instances of that object in an image database. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/1.jpg)
Fast and Compact Retrieval Methods in Computer Vision
Rahul GargXiao Ling
![Page 2: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/2.jpg)
Objective
• Given a Query image, find all instances of that object in an image database
phrase
On the web
![Page 3: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/3.jpg)
Objective
• Given a Query image, find all instances of that object in an image database
phrase
On the web
World Wide Web
“search this”
![Page 4: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/4.jpg)
Text Search Overview
![Page 5: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/5.jpg)
Document Representation
Parse into words vocabulary
Vector of frequencies of
words
(0,3,4,0,0,5,6,0,……….,1)
![Page 6: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/6.jpg)
Document Representation
(0,3,4,0,0,5,6,0,……….,1)K dimensional vectorK : number of words in vocabulary
![Page 7: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/7.jpg)
Document Representation: Example
The quick brown fox jumps over the brown dog
QuickBrownFoxJumpOverDogCat
Vocabulary
1 2 1 1 1 1 0
Quick Brown Fox Jump Over Dog Cat
![Page 8: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/8.jpg)
Document Representation: Weighted Frequencies
• Uncommon words are given more weight
Weight( wordi) α log(1/number of occurrences of wordi in the whole database)
1 2 1 1 1 1 0
Quick Brown Fox Jump Over Dog Cat
Term frequency – inverse document frequency (tf-idf)
![Page 9: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/9.jpg)
Querying
• Reduce query to vector form
• Find “nearest” document vectors
quick brown fox (1,1,1,0,0,0,0)
![Page 10: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/10.jpg)
Text Retrieval Applied to Object Retrieval in Images
• Video Google, Sivic et. al. ICCV 2003
![Page 11: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/11.jpg)
Text Retrieval vs Object Retrieval
Documents Images
Words
Text Retrieval Object Retrieval
![Page 12: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/12.jpg)
“Visual Words”
• Idea: Regions of image which are easy to detect and match
Build descriptor(representation)of the region
![Page 13: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/13.jpg)
Feature Descriptors
• Issues– Illumination changes– Pose changes– Scale….
• Video Google uses SIFT
( 130,129,….,101)
![Page 14: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/14.jpg)
Problem: Visual Words are noisy
Descriptors may turn out to be slightly different
Solution: Quantize!
![Page 15: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/15.jpg)
Building Vocabulary of Visual Words
Throw in all descriptorsfrom the database
Cluster Into K Visual words usingK means
Vocabulary of K visual words
![Page 16: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/16.jpg)
Image RepresentationAnalogous to Document Representation
Find Descriptors
MapDescriptors
To nearest visual wordsFrequencyVector
(0,3,4,0,0,5,6,0,……….,1)
![Page 17: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/17.jpg)
Querying
(0,3,4,0,0,5,6,0,……….,1)
Find similar vectors (images) in the database
![Page 18: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/18.jpg)
Finding Similar Vectors
• Problem: Number of Vectors is large
• Vectors are sparse: index using words
Word 1 List of images containing word1
Word 2 …..
Inverted Index Files
![Page 19: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/19.jpg)
Stop Lists
• Text Retrieval: Remove common words from vocabulary
• Analogy: Remove common visual words from vocabulary
is, are, the, that,…
![Page 20: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/20.jpg)
Spatial Consistency Ranking
• Text Retrieval: Increase ranking of results where search words appear close
The quick brown fox jumps over the lazy dog
Fox news: How to make brownbrownies quickly>
Search: quick brown fox
![Page 21: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/21.jpg)
Spatial Consistency Ranking
• More relevant in case of images: visual words need to be in same configuration
T
![Page 22: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/22.jpg)
Spatial Consistency Ranking in Video Google
![Page 23: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/23.jpg)
Performance Evaluation Metrics
K results returned by queryc: correct resultsN: total number of correct results in the database
• Precision: c/K• Recall: c/N
Increase K gradually to generate (Precision, Recall) pairs till K = N (Recall = 1.0)
![Page 24: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/24.jpg)
Performance Evaluation Metrics
0.1 0.2 0.3 0.4 0.5
0.600000000000001
0.700000000000001 0.8 0.9 10
0.20.40.60.8
11.2
Precision-Recall Curve
Precision
RecallArea Under Curve:Average Precision (AP)
![Page 25: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/25.jpg)
Video Google: SummaryFind Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
SIFT
K MeansO(Nk)
Linear SearchO(k)
Inverted Files
Loose Spatial Consistency
VocabBuilding
QueryStage
We need MORE words!
K = ~6K – 10K
![Page 26: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/26.jpg)
Tree Structure
• Tree structure for– searching – indexing
Incorporate inner nodes for scoring
iii wnq
Term Frequency
Inverted Document Frequency
![Page 27: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/27.jpg)
Hierarchical K-Means [Nistér et al, CVPR’06]
• K – branching factor• Time complexity:– O(N log (# of leaves)) for construction– O(log (# of leaves)) for searching
• Cons:– Wrong nearest neighbors assignment– Suffer from bad initial clusters
![Page 28: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/28.jpg)
Hierarchical K-Means: SummaryFind Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
MSER. SIFT
Hierarchical K MeansO(N log(# of leaves))
Search along the pathO(k log(# of leaves))
Each node has a Inverted File list
No Spatial Consistency
VocabBuilding
QueryStage
# of leaves = 1M
SIFT
K MeansO(Nk)
Linear SearchO(k)
Inverted Files
Loose Spatial Consistency
![Page 29: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/29.jpg)
Approximate k-means [Philbin et al, CVPR’07]
• HKM: 1. not the best NN 2. error propagation• Go back to flat vocabulary, but much faster• Nearest neighbor search is the bottleneck• Use kd-tree to speed up
![Page 30: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/30.jpg)
Kd tree
• k-d tree hierarchically decomposes the descriptor space
![Page 31: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/31.jpg)
Approximate k-means cont.
• Best bin first Search: O(log k)
dist
dist
Priority queue by dist
![Page 32: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/32.jpg)
AKM cont.
• Use multiple (L) randomized kd trees for Approximate NN search, in both construction and assignment phase
• Searching complexity: O(L*log K + C)
• Approximate K-means complexity: O(N log K)
Share one priority queue!
![Page 33: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/33.jpg)
Approximate k-means cont.
• Close to exact k-means, though much faster
• Superior to HKM empirically
Mean AP
![Page 34: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/34.jpg)
Approximate K-Means: SummaryFind Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
SIFT
Approximate K MeansO(N log(# of leaves))
SearchO(log(# of leaves))
Inverted File list
Transformation based Spatial Verification
VocabBuilding
QueryStage
# of leaves = 1M
![Page 35: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/35.jpg)
Low Recall
• Feature detection and quantization– Even for the same object,
different visual words!– Query region may not
contain enough features• Two possible solutions– Query expansion [Chum et al, ICCV’07]
– Soft Assignment [Philbin et al, CVPR’08]
NOISY!
![Page 36: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/36.jpg)
Query Expansion
• Text retrieval– Dimension is too high!
Query: violin ……
results Results about fiddle
……Search engine
Expanded query
Search engine
![Page 37: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/37.jpg)
Query expansion [Chum et al, ICCV’07]
• Basic idea: augment the query with visual words from initial matching region
Query initial result list
expanded query by new resultsaveraging the results
…
• What if the initial results are poor?– Filter by spatial constraints
![Page 38: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/38.jpg)
Query expansion cont.
• Results
Query Initial results Results by expansion
![Page 39: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/39.jpg)
Query expansion cont.
• Increasing recall without loss of precision
before after
Each line: each query image for a certain landmark
Precision-recall curves
![Page 40: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/40.jpg)
Query Expansion: SummaryFind Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
SIFT
Approximate K Means
Search
Inverted File list
Spatial Verification to find inliers for expansion
VocabBuilding
QueryStage
![Page 41: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/41.jpg)
Soft Assignment [Philbin et al, CVPR’08]
• Try to capture the information for the near-boundary features by associating one feature to several words
• Intuition: includes “semantically” similar variants in the context of text retrieval
• with denser image representation (thus more storage)
• Can be applied to existing methodstradeoff!
![Page 42: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/42.jpg)
Soft Assignment
• Associate a single descriptor with r nearby cluster centers instead of its single nearest-neighbor clusterweight
• Modified tfidf scheme– tf: use real values for frequency– idf: counting occurrence as one, empirically best
• Modified spatial verification– weighted score instead of occurrence to rank hypothesis
)2
exp( 2
2
d
![Page 43: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/43.jpg)
MatchingResults
![Page 44: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/44.jpg)
Soft assignment cont.
• Improvements
![Page 45: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/45.jpg)
Soft Assignments: SummaryFind Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
SIFT
Approximate K Means
Soft Assignment
Inverted File list
Spatial Verification reranking + query expansion
VocabBuilding
QueryStage
3 times storage when using 3-NN soft assignment
10% mAP gain
Tradeoff!
![Page 46: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/46.jpg)
Spatial Information Lost
• Quantization is information loss process– From 2d (pixel) structure to (feature)vector
• How to model the geometry?
![Page 47: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/47.jpg)
Spatial Consistency Constraints [Chum et al, ICCV’07]
T (3 dof)
T’ (6 dof)
scale1 scale2
![Page 48: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/48.jpg)
Conclusion
• Borrow text retrieval methods to conduct fast image retrieval, e.g. tf-idf weight, query expansion
• Quantization, searching and indexing are the core problems
Find Descriptors
Learn Vocabulary
Assign descriptors to words
Find Similar vectors
Rank Results
![Page 49: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/49.jpg)
Future Work
• Goal: Web-scale retrieval system
• Vocabulary to span the space of all images?
• Spatial information in Indexing instead of Ranking
![Page 50: Fast and Compact Retrieval Methods in Computer Vision](https://reader036.vdocuments.site/reader036/viewer/2022062814/56816853550346895dde628f/html5/thumbnails/50.jpg)
QA and comments?