cap5415-computer vision lecture 21-deformable part model...

CAP5415-ComputerVisionLecture21-DeformablePartModel(DPM)

GuestLecturer:Sarfaraz Husein

1

Motivation• Objectcategorydetection:

– Detectallobjectswiththesamecategoryinanimage• Forexamplehorsedetection:

2

ClassificationTask

3

Atleastonebus

100%

DetectionTask

4

100%

Predictedboundingboxshouldoverlapbyatleast50%withgroundtruth!!!

Detections“nearmisses”

5

PascalVOCChallenge-TheObjectClasses

6

Imagesretrieved fromflickerwebsite.

SuccessfulDetectionMethod

• Jointwinnerin2009PascalVOCchallengewiththeOxfordMethod.

• Awardof"lifetimeachievement“in2010.• Mixtureofdeformablepartmodels.• Eachcomponenthasglobaltemplate+deformableparts– HOGfeaturetemplates.

• Fullytrainedfromboundingboxesalone.

(P.Felzenszwalbetal.)

7

Recap:DeformableModels

8

Challenge:handlingvariationofappearanceswithinobjectclassesnon-rigidobjects

Idea:Considerobjectsasadeformedversionofatemplate!

WhatisDPM?• Deformable Part is a discriminatively trained,

multi-scale model for image training that aim at making possible the effective use of more latent information such as hierarchical (grammar) models and models involving latent three dimensional pose.

9

DPM• Definition:

– Root : Catch Roughly appearance of object– Part : Catch local appearance of object– Spring : spatial connections between parts

• Displacement :– Using minimizing energy function to find the optimal

displacement[1] Pictorial Structures for Object Recognition, Daniel P. Huttenlocher,1973

10

Model

11

(b1) coarse template (b2)part templates (b3) spatial model (a) person detection Example

The deformable model includes both a coarse global template covering an entire object and higher resolution part templates. The templates represent histogram of gradient features

ModelingSpatialRelationswithDPM

12

Spring-based models

[Fischler & Elschlager ,’73], [Ramanan et al,’07], [Felszwenwalb et al,’05,’09], [Kumar et al, ‘09]

Parts Based Model

Vertices – Local Appearance

Edges - Spatial Relationship

JJ

JL

LL

Goal: Assign model parts to image regions preserving both local appearance and spatial

relationships

MatchingProblem

14

MatchingProblem

15

Parts based models - Inference Problem

Inference problem: What is the best scoring assignment f?

Local Appearance term Pairwise Spatial Relationship term

Inference is NP-hard for general graphs

For trees can use belief propagation for exact solution in polytime

Parts based models - Learning Problem

Linear models:

s.t.

Local Appearance term Pairwise Spatial Relationship term

Convex max-margin objective

Positive examples on one side

Negative examples on the other

[Kumar et al,’09]

Learning linear models: Find weight vectors that best separate positive and negative examples. E.g.,

FindingMotorbike

18

HumanTracking

19

HumanPoseEstimation

20

HoG:HistogramofGradients

21

HistogramofGradient1.Compute gradient every pixel2.Group 8*8 pixels into a cell, and 4*4 cells into a

block.Build histogram of each cells about 9 orientation

bins (00~1800)

22

HistogramofGradient3.A block contain 4*9 feature vector for local

information, and a bounding box which contain n’s blocks has n*4*9 feature vector

4.Training by SVM

23

Filters

24

ObjectHypothesis

25

Training

26

LearnedModels

27

Training Models• Latent SVM is used for learning the training images

Experiment Result• PASCAL VOC 2006,2007,2008 comp3 challenge datasets• Some statistics:

– It takes 2 seconds to evaluate a model in one image(4952 images in the test dataset)

– It takes 4 hours to train a model– MUCH faster than most systems.– All of the experiments were done on a 2.8Ghz 8-core Intel Xeon Mac

Pro computer running Mac OS X 10.5.

Experiment Result• Measurement: predicted bounding box is correct if it overlaps

more than 50 percent with ground truth bounding box; otherwise, considered false positive

Experiment Result

Experiment Result• Best Average Precision score in 9 out of 20, second in 8

Experiment Result

SliceCreditsandReferences• JonathanHuang,TomaszMalisiewicz,2009.• P.Felzenszwalb,D.McAllester,D.Ramanan.ADiscriminatively Trained,

Multiscale,DeformablePartModel.IEEEConferenceonComputerVisionandPatternRecognition(CVPR),2008

• P.Felzenszwalb,R.Girshick,D.McAllester,D.Ramanan.ObjectDetectionwithDiscriminatively TrainedPartBasedModels. IEEETransactionsonPatternAnalysis andMachineIntelligence,Vol.32,No.9,Sep.2010.

• LevelImageRepresentationforSceneClassificationandSemanticFeatureSparsification.ProceedingsoftheNeuralInformationProcessingSystems(NIPS),2010.

35

cap5415-computer vision lecture 21-deformable part model...

Documents