cap5415-computer vision lecture 21-deformable part model...
TRANSCRIPT
CAP5415-ComputerVisionLecture21-DeformablePartModel(DPM)
GuestLecturer:Sarfaraz Husein
1
Motivation• Objectcategorydetection:
– Detectallobjectswiththesamecategoryinanimage• Forexamplehorsedetection:
2
ClassificationTask
3
Atleastonebus
100%
DetectionTask
4
100%
Predictedboundingboxshouldoverlapbyatleast50%withgroundtruth!!!
Detections“nearmisses”
5
PascalVOCChallenge-TheObjectClasses
6
Imagesretrieved fromflickerwebsite.
SuccessfulDetectionMethod
• Jointwinnerin2009PascalVOCchallengewiththeOxfordMethod.
• Awardof"lifetimeachievement“in2010.• Mixtureofdeformablepartmodels.• Eachcomponenthasglobaltemplate+deformableparts– HOGfeaturetemplates.
• Fullytrainedfromboundingboxesalone.
(P.Felzenszwalbetal.)
7
Recap:DeformableModels
8
Challenge:handlingvariationofappearanceswithinobjectclassesnon-rigidobjects
Idea:Considerobjectsasadeformedversionofatemplate!
WhatisDPM?• Deformable Part is a discriminatively trained,
multi-scale model for image training that aim at making possible the effective use of more latent information such as hierarchical (grammar) models and models involving latent three dimensional pose.
9
DPM• Definition:
– Root : Catch Roughly appearance of object– Part : Catch local appearance of object– Spring : spatial connections between parts
• Displacement :– Using minimizing energy function to find the optimal
displacement[1] Pictorial Structures for Object Recognition, Daniel P. Huttenlocher,1973
10
Model
11
(b1) coarse template (b2)part templates (b3) spatial model (a) person detection Example
The deformable model includes both a coarse global template covering an entire object and higher resolution part templates. The templates represent histogram of gradient features
ModelingSpatialRelationswithDPM
12
Spring-based models
[Fischler & Elschlager ,’73], [Ramanan et al,’07], [Felszwenwalb et al,’05,’09], [Kumar et al, ‘09]
Parts Based Model
Vertices – Local Appearance
Edges - Spatial Relationship
JJ
JL
LL
Goal: Assign model parts to image regions preserving both local appearance and spatial
relationships
MatchingProblem
14
MatchingProblem
15
Parts based models - Inference Problem
Inference problem: What is the best scoring assignment f?
Local Appearance term Pairwise Spatial Relationship term
Inference is NP-hard for general graphs
For trees can use belief propagation for exact solution in polytime
Parts based models - Learning Problem
Linear models:
s.t.
Local Appearance term Pairwise Spatial Relationship term
Convex max-margin objective
Positive examples on one side
Negative examples on the other
[Kumar et al,’09]
Learning linear models: Find weight vectors that best separate positive and negative examples. E.g.,
FindingMotorbike
18
HumanTracking
19
HumanPoseEstimation
20
HoG:HistogramofGradients
21
HistogramofGradient1.Compute gradient every pixel2.Group 8*8 pixels into a cell, and 4*4 cells into a
block.Build histogram of each cells about 9 orientation
bins (00~1800)
22
HistogramofGradient3.A block contain 4*9 feature vector for local
information, and a bounding box which contain n’s blocks has n*4*9 feature vector
4.Training by SVM
23
Filters
24
ObjectHypothesis
25
Training
26
LearnedModels
27
Training Models• Latent SVM is used for learning the training images
Experiment Result• PASCAL VOC 2006,2007,2008 comp3 challenge datasets• Some statistics:
– It takes 2 seconds to evaluate a model in one image(4952 images in the test dataset)
– It takes 4 hours to train a model– MUCH faster than most systems.– All of the experiments were done on a 2.8Ghz 8-core Intel Xeon Mac
Pro computer running Mac OS X 10.5.
Experiment Result• Measurement: predicted bounding box is correct if it overlaps
more than 50 percent with ground truth bounding box; otherwise, considered false positive
Experiment Result
Experiment Result
Experiment Result• Best Average Precision score in 9 out of 20, second in 8
Experiment Result
SliceCreditsandReferences• JonathanHuang,TomaszMalisiewicz,2009.• P.Felzenszwalb,D.McAllester,D.Ramanan.ADiscriminatively Trained,
Multiscale,DeformablePartModel.IEEEConferenceonComputerVisionandPatternRecognition(CVPR),2008
• P.Felzenszwalb,R.Girshick,D.McAllester,D.Ramanan.ObjectDetectionwithDiscriminatively TrainedPartBasedModels. IEEETransactionsonPatternAnalysis andMachineIntelligence,Vol.32,No.9,Sep.2010.
• LevelImageRepresentationforSceneClassificationandSemanticFeatureSparsification.ProceedingsoftheNeuralInformationProcessingSystems(NIPS),2010.
35