Combining efficient object localization and image classification
H. Harzallah, F. Jurie and C. Schmid
LEAR, INRIA Grenoble, LJK
Tasks
• Image classification: assigning labels to the image
– Car: present
– Cow: present
– Bike: not present
– Horse: not present
– …
• Object localization: define the location and the category
[Figure: car and cow, each annotated with location and category]
Contributions
• Object class localization method
• Combining image classification and object localization
[Figure: two example images, one labeled Localization++ / Classification--, the other Localization-- / Classification++]
Overview
• Related work and datasets
• Efficient object localization
– Experimental results
• Combining image classification and localization
– Experimental results
• Conclusion
Related work
• Object localization
– Sliding window [Dalal06] [Rowley95]
– Implicit shape model [Leibe04]
– SVM classifiers [Chum07] [Ferrari08]
– Cascade of classifiers [Viola01] [Vedaldi09]
• Context information
– Combination of context sources [Divvala09]
– Graphical model of events in images [Li07]
– Local segmentation + global classification [Shotton08] [Heitz08]
PASCAL VOC dataset
• PASCAL VOC dataset 2007 and 2008
• Two tasks: classification and localization
• Fixed train/test set-up for the 20 object classes
• Standard evaluation measure
– Area of overlap as detection matching criterion
– Average precision for performance evaluation
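The two evaluation measures above can be sketched as follows. This is a minimal illustration, not the official development kit code. It assumes boxes are given as (xmin, ymin, xmax, ymax) and uses the 11-point interpolated average precision of the VOC 2007 protocol. The function names are hypothetical.

```python
def overlap_ratio(box_a, box_b):
    """Area of intersection divided by area of union of two boxes.

    Boxes are (xmin, ymin, xmax, ymax); this layout is an assumption.
    """
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision_11pt(scores, labels, n_positives=None):
    """11-point interpolated AP over a ranked list.

    labels[i] is 1 if item i is a correct detection (or a positive image).
    n_positives is the number of ground-truth positives; it defaults to
    sum(labels), which only holds when every positive appears in the list.
    """
    if n_positives is None:
        n_positives = sum(labels)
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += 1 if label else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_positives)
    total = 0.0
    for i in range(11):
        threshold = i / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

In the VOC protocol a detection counts as correct when its overlap ratio with a ground-truth box exceeds 0.5; matching detections to ground truth is left out of this sketch.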
Efficient object localization
• Sliding-window-based approach
• Robust image representation
– Combination of features
– Extensive parameter evaluation
• Efficient search strategy
Image representation
• Combination of two image representations
• Histogram of Oriented Gradients (HOG)
– Gradient-based features
– Integral histograms
• Bag of Features (BOF)
– SIFT features extracted densely + k-means clustering
– Pyramidal representation of the sliding windows
– One histogram per tile
[Figure: sliding window split into tiles, one histogram per tile]
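To illustrate why integral histograms make per-window and per-tile histograms cheap, here is a minimal sketch. It assumes every pixel has already been assigned one bin index (an orientation bin or a visual word) and counts pixels per bin; real HOG features weight votes by gradient magnitude, which is left out. All names are hypothetical.

```python
def build_integral_histogram(bin_map, n_bins):
    """Cumulative histograms: entry [y][x] covers rows < y and columns < x.

    bin_map[y][x] is the bin index of pixel (x, y) (an assumed input format).
    """
    height, width = len(bin_map), len(bin_map[0])
    integral = [[[0] * n_bins for _ in range(width + 1)]
                for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            cell = integral[y + 1][x + 1]
            up, left, diag = integral[y][x + 1], integral[y + 1][x], integral[y][x]
            for b in range(n_bins):
                cell[b] = up[b] + left[b] - diag[b]
            cell[bin_map[y][x]] += 1
    return integral


def window_histogram(integral, x0, y0, x1, y1):
    """Histogram of columns x0..x1-1 and rows y0..y1-1 from four lookups."""
    a, b = integral[y1][x1], integral[y0][x1]
    c, d = integral[y1][x0], integral[y0][x0]
    return [a[i] - b[i] - c[i] + d[i] for i in range(len(a))]


def pyramid_histograms(integral, x0, y0, x1, y1, grid=2):
    """One histogram per tile of a grid x grid split of the window."""
    tiles = []
    for gy in range(grid):
        ty0 = y0 + (y1 - y0) * gy // grid
        ty1 = y0 + (y1 - y0) * (gy + 1) // grid
        for gx in range(grid):
            tx0 = x0 + (x1 - x0) * gx // grid
            tx1 = x0 + (x1 - x0) * (gx + 1) // grid
            tiles.append(window_histogram(integral, tx0, ty0, tx1, ty1))
    return tiles
```

Once the table is built, the cost of a window histogram no longer depends on the window size, which matters when many candidate windows are evaluated per image.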
Efficient search strategy
• Reduce search complexity
– Sliding windows: huge number of candidate windows
– Cascades: pros/cons
• Two-stage cascade:
– Filtering classifier with a linear SVM
  • Low computational cost
  • Evaluation: capacity of rejecting negative windows
– Scoring classifier with a non-linear SVM
  • χ² kernel with a channel combination [Zhang07]
  • Significant increase of performance
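The two-stage idea can be sketched as below: a cheap linear score ranks all windows, and only the best few are rescored by a more expensive function. The multi-channel kernel is written as exp(-Σ_c D_c / A_c), with D_c the χ² distance on channel c and A_c a per-channel normalizer; this form is an assumption based on the cited [Zhang07] combination. All names and parameters are hypothetical.

```python
import math


def linear_score(weights, bias, features):
    """Score of a linear SVM: w . x + b."""
    return sum(w * f for w, f in zip(weights, features)) + bias


def chi2_distance(h1, h2):
    """Chi-square distance between two histograms (empty bins are skipped)."""
    return 0.5 * sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)


def chi2_kernel(channels_x, channels_y, normalizers):
    """Multi-channel kernel exp(-sum_c D_c(x, y) / A_c) (assumed form)."""
    total = sum(chi2_distance(hx, hy) / a
                for hx, hy, a in zip(channels_x, channels_y, normalizers))
    return math.exp(-total)


def two_stage_cascade(windows, filter_fn, score_fn, keep):
    """Keep the `keep` best windows under filter_fn, then rescore with score_fn.

    Returns (score, window) pairs sorted by decreasing final score.
    """
    survivors = sorted(windows, key=filter_fn, reverse=True)[:keep]
    rescored = [(score_fn(w), w) for w in survivors]
    return sorted(rescored, key=lambda pair: pair[0], reverse=True)
```

In a full detector, `score_fn` would be a kernel SVM decision function, a weighted sum of `chi2_kernel` values against the support vectors; that part is omitted here.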
Efficiency of the two-stage localization
• Performance with respect to the number of windows selected by the linear SVM (mAP on PASCAL 2007)
• Sliding windows: 100k candidate windows
• A small number of windows is enough after filtering
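To give a feel for how the number of candidate windows grows, here is a minimal sketch of a sliding-window enumerator. The single aspect ratio, the scale step, and the stride (a fraction of the window size) are all illustrative assumptions, not the settings used in the talk.

```python
def sliding_windows(img_w, img_h, base_w, base_h, scale_step=1.2, stride_frac=0.25):
    """Yield (xmin, ymin, xmax, ymax) windows over positions and scales.

    All parameters are hypothetical; a real detector would also vary
    the aspect ratio.
    """
    w, h = float(base_w), float(base_h)
    while w <= img_w and h <= img_h:
        win_w, win_h = int(w), int(h)
        step_x = max(1, int(win_w * stride_frac))
        step_y = max(1, int(win_h * stride_frac))
        for y in range(0, img_h - win_h + 1, step_y):
            for x in range(0, img_w - win_w + 1, step_x):
                yield (x, y, x + win_w, y + win_h)
        w *= scale_step
        h *= scale_step
```

Calling `sum(1 for _ in sliding_windows(...))` with an image size and a base window size gives the number of candidates for a given setting; finer strides and scale steps increase the count quickly.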
Localization performance: aeroplane
Method          AP
χ², HOG+BOF     33.8
χ², BOF         29.8
χ², HOG         18.4
Linear, HOG     10.0
Localization performance: car
Method          AP
χ², HOG+BOF     50.4
χ², BOF         42.3
χ², HOG         47.5
Linear, HOG     33.9
Localization performance
• Mean Average Precision on all 20 classes
• PASCAL 2007 dataset
Method              mAP
Linear, HOG         14.6
Linear, BOF         15.0
Linear, HOG+BOF     17.6
χ², HOG             21.9
χ², BOF             23.1
χ², HOG+BOF         26.3
Localization examples: correct localizations
[Figure: car, sofa, bicycle, horse]
Localization examples: false positives
[Figure: car, sofa, bicycle, horse]
Localization examples: missed objects
[Figure: car, sofa, bicycle, horse]
Combination: key points
• Image classification and localization use different information
• For many true positives (TP), only one of the two has a high score
• Truncated objects: hard for the detector
• Small objects: fine for the detector, but not for the classifier, which uses global information
Combination model
• Input: classification (Si) and localization (Sw) scores
• Output: probability that the object is present
• Suppose that classification and localization outputs are independent:
Combination model
• For each modality (classification/detection): notion of detectability, P(Di) for the classifier and P(Dw) for the detector
• Encodes the ability to detect the presence of the objects
• Assuming that the classifier/detector outputs conditional probabilities: P(O|Di,Si) and P(O|Dw,Sw)
Combination model
• P(O|Si) = P(Di) × P(O|Si,Di) + P(¬Di) × P(O|Si,¬Di)
• P(O|Sw) = P(Dw) × P(O|Sw,Dw) + P(¬Dw) × P(O|Sw,¬Dw)
• Final probability:
• Handle both cases:
– Object detectable by two modalities
– Object detectable by only one modality
Combination model
• P(O|¬Di,Si) and P(O|¬Dw,Sw): constant value
• Sw = classification by localization: highest localization score
• Priors P(Di) and P(Dw) are class-dependent
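The per-modality part of the model can be sketched as follows: each modality's presence probability is a mixture over "detectable" and "not detectable", with a constant used in the non-detectable case, and the image-level localization score is the highest window score. The slides do not give the mapping from raw scores to probabilities or the final combination formula in recoverable form, so neither is implemented here. Function names and the example numbers are hypothetical.

```python
def marginal_presence(p_detectable, p_present_if_detectable, p_present_if_not):
    """P(O|S) = P(D) * P(O|S,D) + P(not D) * P(O|S,not D).

    p_present_if_not is the constant value used when the object is not
    detectable by this modality.
    """
    return (p_detectable * p_present_if_detectable
            + (1.0 - p_detectable) * p_present_if_not)


def classification_by_localization(window_scores):
    """Image-level localization score Sw: the highest window score."""
    return max(window_scores, default=float("-inf"))


# Illustrative numbers only: a class-dependent prior P(Di) = 0.8, a classifier
# output P(O|Si,Di) = 0.9, and a constant P(O|Si,not Di) = 0.1.
p_classifier = marginal_presence(0.8, 0.9, 0.1)
```

With these numbers the mixture gives 0.8 × 0.9 + 0.2 × 0.1 = 0.74: a low detectability prior pulls the result toward the constant regardless of the modality's score.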
Combination experimental setup
• Image classifier: INRIA_flat classifier
– SVM classifier with a χ² kernel using multiple feature channels [Zhang07]
– Excellent results in PASCAL 2008 challenge
• Detector: as described previously
• Experimental validation on PASCAL VOC 2007
• Comparison to the state of the art on PASCAL VOC 2008
Experimental results : gain obtained
• Classification
• Localization
[Figure: bar chart of gain in AP (axis 0 to 12) for Bottle, Plant, TV, Car, Cow]
[Figure: bar chart of gain in AP (axis 0 to 8) for Cow, Sheep, Bottle, Sofa, Cat]
Method              mAP
Base Classifier     60.1
Our Combination     63.5
Method              mAP
Base Detector       26.3
Our Combination     28.9
Experimental results
Car localization
• Correct but low-score localization
• High classification score: score increased after combination
Experimental results
Car classification
• High classification score
• No localization: score decreased after combination
Comparison to the state of the art
• Based on blind evaluation on PASCAL VOC 2008
• Classification
– Best on 12 classes out of 20
Method                              mAP
Lear_flat                           53.8
Lear_shotgun                        54.5
SurreyUvA_SRKDA                     54.9
UvA_TreeSFS                         54.3
Our method (based on Lear_flat)     57.7
• Localization
– Best on 11 classes out of 20
Method          mAP
CASIA_Det       12.7
MPI_struct      10.4
UoCTTIUCI       22.8
Our method      22.7
Conclusion
• Efficient localization method
• Successful combination of classification and localization
• State of the art performance on both tasks
Thank you