Efficient Part-Based Recognition of Multiple Object Classes
Object Class Recognition Systems
• Model Representation
  – Part-based model
    • Part appearance
    • Part position
• Learning Algorithm
• Recognition Algorithm
History of Part-Based Object Recognition
[Figure: Template Matching; Part-Based; Constellation Model; Bag of Parts; k-Fans; Proposed Algorithm. Citations shown: Burl et al. 1998; Hoffman 1999; Fergus et al. 2003, 2005; Crandall et al. 2005, 2006]
Template Matching
• Efficient? (N = # pixels) – Yes, O(N)
• Robust? – No, inflexible
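A minimal sketch of template matching, assuming a grayscale image stored as a list of rows and a sum-of-squared-differences score (the slides do not say which score is used). The function name `match_template` is hypothetical. For a fixed template size the work grows linearly with the number of pixels N, and the rigid window comparison is what makes the method inflexible.

```python
def match_template(image, template):
    """Return (row, col, ssd) of the window with the smallest sum of squared differences."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best = None
    for r in range(ih - th + 1):          # every window position: O(N) windows
        for c in range(iw - tw + 1):
            ssd = sum((image[r + dr][c + dc] - template[dr][dc]) ** 2
                      for dr in range(th) for dc in range(tw))
            if best is None or ssd < best[2]:
                best = (r, c, ssd)
    return best


image = [[0, 0, 0, 0],
         [0, 1, 2, 0],
         [0, 3, 4, 0]]
template = [[1, 2],
            [3, 4]]
best_match = match_template(image, template)
```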
Part-Based
• Part appearance (template matching)
• Part location (fully connected)
• Efficient? (P = #parts, N = #pixels) – No, O(PN^P)
• Robust? Yes!
• All following algorithms try to approach the accuracy of this method, while gaining efficiency
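To see where the N^P factor comes from, here is a brute-force sketch under stated assumptions: positions are 1-D indices, costs are "lower is better", and both `appearance_cost` and `pair_cost` are hypothetical inputs rather than anything defined in the slides. Every one of the N^P placements is enumerated; this sketch scores all part pairs per placement, so its per-placement cost is quadratic in P rather than the linear factor quoted above.

```python
from itertools import product


def best_configuration(appearance_cost, pair_cost):
    """Exhaustive search over all placements of all parts.

    appearance_cost[i][p]: cost of placing part i at position p.
    pair_cost(i, j, p, q): deformation cost between part i at p and part j at q.
    """
    num_parts = len(appearance_cost)
    num_positions = len(appearance_cost[0])
    best_cost, best_config = float("inf"), None
    for config in product(range(num_positions), repeat=num_parts):  # N^P placements
        cost = sum(appearance_cost[i][config[i]] for i in range(num_parts))
        cost += sum(pair_cost(i, j, config[i], config[j])
                    for i in range(num_parts) for j in range(i + 1, num_parts))
        if cost < best_cost:
            best_cost, best_config = cost, config
    return best_cost, best_config


# Two parts, three positions; part 1 prefers to sit one step to the right of part 0.
costs = [[0, 5, 5], [5, 5, 0]]
result = best_configuration(costs, lambda i, j, p, q: abs((q - p) - 1))
```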
Constellation Model
[Figure: trade-off between sparse interest points (n << N) and the dense image (n = N); labels: Robust, Efficient]
• Efficient: O(Pn^P) (n = # interest points, P = # parts)
• Approximation:
  – Sparse image: only consider interest points (n << N)
• Interest point detector too general
  – Regions of the image are discarded without considering particular parts that may be there
Bag of Parts
• Part appearance (template matching)
• Part location (ignored, disconnected)
• Efficient? (P = #parts, N = #pixels) – Yes, O(NP)
• Robust?
  – No localization
  – More likely to find false detections
  – …but still in common use
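The bag-of-parts score can be sketched as one independent maximum per part, assuming each part has a precomputed appearance map (a flat list of match scores, higher is better; the name `bag_of_parts_score` is hypothetical). Because positions are ignored, parts found next to each other and parts scattered across the image score the same, which is one way to see why false detections become more likely.

```python
def bag_of_parts_score(appearance_maps):
    """Sum over parts of the best appearance score anywhere in the image.

    appearance_maps[i][p] is the match score of part i at pixel p.
    One pass over each of the P maps of N pixels gives O(NP) work.
    """
    return sum(max(part_map) for part_map in appearance_maps)


# Both parts respond strongly at neighbouring pixels ...
compact = [[1, 9, 2, 0], [0, 1, 8, 3]]
# ... or at opposite ends of the image: the score cannot tell the difference.
scattered = [[9, 1, 2, 0], [0, 1, 3, 8]]
compact_score = bag_of_parts_score(compact)
scattered_score = bag_of_parts_score(scattered)
```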
1-Fans
• Efficient? (N = # pixels, P = # parts) Yes: O(PN^2) naively, O(NP) with dynamic programming – same as bag of parts!
  – Use Dynamic Programming:
    • Precompute the DT_j image in O(N)
    • Then add the A_1 and DT_j images
• Robust? – Yes, better than bag of parts.
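The dynamic-programming step can be sketched in 1-D, reading DT_j as a distance transform of part j's appearance cost and A_1 as the root part's appearance cost (an interpretation of the slide's notation, not something it spells out). This sketch assumes a linear (L1) deformation penalty, for which a two-pass transform runs in O(N); a quadratic penalty would need a different linear-time transform. The names `distance_transform_l1`, `one_fan_costs`, `offsets`, and `slope` are hypothetical.

```python
def distance_transform_l1(cost, slope=1):
    """D[p] = min over q of cost[q] + slope * |p - q|, computed in two O(N) passes."""
    d = list(cost)
    for p in range(1, len(d)):              # forward pass: best q at or left of p
        d[p] = min(d[p], d[p - 1] + slope)
    for p in range(len(d) - 2, -1, -1):     # backward pass: best q at or right of p
        d[p] = min(d[p], d[p + 1] + slope)
    return d


def one_fan_costs(root_cost, part_costs, offsets, slope=1):
    """Total cost of placing the root at each position, with every other part
    placed optimally relative to it. O(N) per part, so O(NP) overall."""
    n = len(root_cost)
    total = list(root_cost)
    for cost, offset in zip(part_costs, offsets):
        dt = distance_transform_l1(cost, slope)
        for p in range(n):
            ideal = p + offset                    # where this part would ideally sit
            inside = min(max(ideal, 0), n - 1)    # nearest position inside the image
            total[p] += dt[inside] + slope * abs(ideal - inside)
    return total


# One non-root part that ideally sits two positions to the right of the root.
totals = one_fan_costs([5, 5, 0, 5, 5], [[9, 9, 9, 9, 0]], offsets=[2])
best_root = totals.index(min(totals))
```

Adding the transformed image to the root's appearance image replaces the inner minimization over part positions, which is where the naive O(PN^2) cost drops to O(NP).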
Aren’t We Done?
• Object Recognition is efficient and robust. Can’t we stop here?
• What about…
  – Detecting multiple object classes? (M = # objects… think 30,000) O(MNP)
Recap: History of Part-Based Object Recognition
[Figure: Template Matching; Part-Based; Constellation Model; Bag of Parts; k-Fans; Proposed Algorithm; labels: Accurate, Efficient]
Proposed Algorithm
• 1-fan
• Sparse Appearance Image (thresholded)
• Does not rely on a general interest point detector
Proposed Algorithm
• How do we threshold appearances in sublinear time, i.e. in less than O(MNP)? (M = # objects, N = # pixels, P = # parts/object)
• View thresholded appearance detection as an R-nearest neighbor problem
R-Nearest Neighbors
• What is the set of points in a database that are within a radius R from the query point q?
• The space: part appearances (high dimensions)
• Assume part appearances are identically distributed spherical Gaussians (∀i: Σ_i = cI)
• Database points are the means (μ_i)
• 1-Fan part appearances can be expressed this way
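Why thresholding becomes a radius query: with a shared spherical covariance cI, the Gaussian log-density depends on the query only through its distance to the mean, so "log-likelihood at least log τ" is the same test as "distance at most R". The sketch below works this out and answers the R-near-neighbor query by a plain linear scan, which is the baseline any sublinear method has to beat. The threshold `log_tau` and all function names are assumptions made for illustration.

```python
import math


def log_gaussian(x, mu, c):
    """Log-density of a spherical Gaussian N(mu, c*I) at x."""
    d = len(x)
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, mu))
    return -sq_dist / (2.0 * c) - 0.5 * d * math.log(2.0 * math.pi * c)


def radius_for_threshold(log_tau, c, d):
    """Radius R with: log_gaussian(x, mu, c) >= log_tau  <=>  ||x - mu|| <= R.
    Returns None when the threshold is above the density's peak."""
    r_squared = -2.0 * c * (log_tau + 0.5 * d * math.log(2.0 * math.pi * c))
    return math.sqrt(r_squared) if r_squared >= 0 else None


def r_near_neighbors(query, means, radius):
    """Linear scan: indices of all database points within `radius` of `query`."""
    return [i for i, mu in enumerate(means) if math.dist(query, mu) <= radius]


means = [(0.0, 0.0), (3.0, 4.0), (1.0, 0.0)]
query = (0.0, 0.0)
radius = radius_for_threshold(log_tau=-3.0, c=1.0, d=2)
by_radius = r_near_neighbors(query, means, radius)
by_likelihood = [i for i, mu in enumerate(means) if log_gaussian(query, mu, 1.0) >= -3.0]
```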
• Okay, how fast can it be solved?
  – In low dimensions (<8), kD trees solve it in sublinear time
  – But the conjectured “curse of dimensionality” prevents it from being solved efficiently for high dimensions
• Locality Sensitive Hashing solves the problem approximately
  – Finds each point within radius R with probability at least 1-δ (misses it with probability at most δ)
  – Solves it in O(d n^(1/c + o(1)))
  – Trades off probability of false negative for efficiency
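A minimal sketch of locality sensitive hashing for Euclidean distance, assuming the random-projection scheme h(x) = floor((a·x + b) / w) with Gaussian a; the slides do not name a particular hash family, and the class name `EuclideanLSH` and every parameter value here are illustrative choices. Candidates come only from buckets the query hashes into and are then checked against the true distance, so the result never contains a false positive but may miss a true neighbor.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    def __init__(self, dim, num_tables=8, hashes_per_table=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.points = []
        self.tables = []  # one (hash functions, buckets) pair per table
        for _ in range(num_tables):
            funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)],
                      rng.uniform(0.0, bucket_width))
                     for _ in range(hashes_per_table)]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, x):
        # Concatenate several quantized random projections into one bucket key.
        return tuple(math.floor((sum(ai * xi for ai, xi in zip(a, x)) + b) / self.w)
                     for a, b in funcs)

    def add(self, x):
        index = len(self.points)
        self.points.append(tuple(x))
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, x)].append(index)
        return index

    def query(self, q, radius):
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        # Verify candidates exactly, so only false negatives are possible.
        return sorted(i for i in candidates if math.dist(q, self.points[i]) <= radius)


database = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (10.0, 10.0, 10.0), (5.0, 5.0, 5.0)]
lsh = EuclideanLSH(dim=3)
for point in database:
    lsh.add(point)
found = lsh.query((0.0, 0.0, 0.0), radius=1.0)
```

More tables lower the chance of a miss at the cost of more memory and more bucket lookups, which is the trade-off named in the last bullet.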