Post on 19-Dec-2015
Computational Vision: Object Recognition
Object Recognition
Jeremy Wyatt
Plan
David Marr: the model based approach to vision
Model based approaches: Geons, Model Fitting
Appearance based approaches: PCA, SIFT, implicit shape model
Psychological Evidence: View dependent vs. view independent recognition
Summary: who is right?
Model based vision
David Marr was a brilliant young British vision researcher who defined a coherent approach to the study of vision during the 1970s
According to one tradition coming out of Marr’s work:
• Vision is the process of reconstructing the 3d scene from 2d information
• The vision system has representations of 3d geometric structures
• Visual pipeline: intensity image → primal sketch → 2.5d sketch → model selection
• So selecting models and recovering their parameters from image data is a key task in vision
Model based vision
There is an infinite variety of objects. How do we represent, store and access models of them efficiently?
One suggestion was the use of a small library of 3d parts from which many complex models can be constructed
There are many schemes: generalised cylinders, Geons, Superquadrics
Vision researchers set about applying them
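To illustrate how compact such a part library can be, here is a minimal sketch of one scheme, the superquadric, written as the standard inside-outside function. The function name, the parameter names and the tiny `PARTS` library are assumptions made for illustration; the slides do not give a formula.

```python
def superquadric_f(x, y, z, a1, a2, a3, e1, e2):
    """Inside-outside function of a superquadric centred at the origin.

    a1, a2, a3 are the half-sizes along x, y, z; e1, e2 control the shape.
    The result is < 1 inside the part, 1 on its surface and > 1 outside.
    """
    xy = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    return xy + abs(z / a3) ** (2.0 / e1)

# A hypothetical three-part library: each part is just five numbers.
PARTS = {
    "sphere":   dict(a1=1.0, a2=1.0, a3=1.0, e1=1.0, e2=1.0),
    "box":      dict(a1=1.0, a2=1.0, a3=1.0, e1=0.1, e2=0.1),
    "cylinder": dict(a1=1.0, a2=1.0, a3=1.0, e1=0.1, e2=1.0),
}

corner = (0.9, 0.9, 0.9)
in_sphere = superquadric_f(*corner, **PARTS["sphere"]) < 1.0
in_box = superquadric_f(*corner, **PARTS["box"]) < 1.0
```

The same point near a corner lies outside the sphere but inside the box-like part, so changing two exponents is enough to switch between quite different shapes. Fitting such parameters to image data is the hard part, and this sketch does not attempt it.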
Models vs Appearances
But they didn’t work very well …
By the early 1990s people were experimenting with statistical techniques, e.g. PCA
These learn a statistical summary of the appearance of each view of an object
Appearance Model
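As a rough illustration of a PCA appearance model, the sketch below learns one principal axis from a handful of flattened training views and recognises a new image by its nearest training view along that axis. It is a toy under stated assumptions: the 4-pixel "images", the labels and all function names are invented, and power iteration is swapped in for a full eigen-decomposition to keep the code dependency free.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pca_1d(views, iters=100):
    """Return (mean image, first principal axis) for a list of flattened views."""
    n, d = len(views), len(views[0])
    mean = [sum(v[j] for v in views) / n for j in range(d)]
    centred = [[v[j] - mean[j] for j in range(d)] for v in views]
    # Start vector; assumed not to be orthogonal to the dominant axis.
    axis = [float(j + 1) for j in range(d)]
    for _ in range(iters):
        # Multiply by the scatter matrix X^T X without forming it explicitly.
        new = [0.0] * d
        for row in centred:
            c = dot(row, axis)
            for j in range(d):
                new[j] += c * row[j]
        norm = math.sqrt(dot(new, new))
        if norm == 0.0:
            break
        axis = [x / norm for x in new]
    return mean, axis

def project(image, mean, axis):
    """Coefficient of an image along the principal axis."""
    return dot([p - m for p, m in zip(image, mean)], axis)

def recognise(image, mean, axis, labelled_coeffs):
    """Label of the training view whose coefficient is closest to the image's."""
    c = project(image, mean, axis)
    return min(labelled_coeffs, key=lambda lc: abs(lc[1] - c))[0]

# Two hypothetical objects, two views each.
training = [("A", [9, 8, 1, 0]), ("A", [8, 9, 0, 1]),
            ("B", [1, 0, 9, 8]), ("B", [0, 1, 8, 9])]
mean, axis = fit_pca_1d([v for _, v in training])
coeffs = [(label, project(v, mean, axis)) for label, v in training]
```

Calling `recognise([8, 8, 1, 1], mean, axis, coeffs)` picks object "A". Each stored coefficient summarises one view, which is why a new viewpoint needs new training data.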
Appearance based recognition: SIFT
These statistical approaches characterise some aspects of the appearance of an object that can be used to recognise it
But this means they are (largely) view dependent: you have to learn a different statistical model for each different view
e.g. SIFT based recognition (David Lowe, UBC)
• Find interest points in scale space
• Re-describe the interest points so that they are robust to image translation, scaling and rotation, and partially invariant to illumination changes, affine and 3d projection changes
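The first step, finding interest points in scale space, can be sketched with a difference-of-Gaussians (DoG) stack. This is a simplified illustration rather than Lowe's full method: it uses a single octave, no sub-pixel refinement and no edge rejection, and it omits the descriptor step entirely. The function names, the sigma values and the threshold are assumptions.

```python
import math

def gaussian_kernel(sigma):
    """1-D Gaussian weights, truncated at 3 sigma and normalised to sum to 1."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(img, sigma):
    """Separable Gaussian blur of a list-of-rows image, clamping at the edges."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    h, w = len(img), len(img[0])
    clamp = lambda i, n: min(max(i, 0), n - 1)
    horiz = [[sum(k[t + r] * row[clamp(x + t, w)] for t in range(-r, r + 1))
              for x in range(w)] for row in img]
    return [[sum(k[t + r] * horiz[clamp(y + t, h)][x] for t in range(-r, r + 1))
             for x in range(w)] for y in range(h)]

def dog_extrema(img, sigmas=(1.0, 1.6, 2.56, 4.096), threshold=0.01):
    """Return (x, y, sigma) where a DoG value beats all 26 space-scale neighbours."""
    h, w = len(img), len(img[0])
    blurred = [blur(img, s) for s in sigmas]
    dogs = [[[b[y][x] - a[y][x] for x in range(w)] for y in range(h)]
            for a, b in zip(blurred, blurred[1:])]
    keypoints = []
    for s in range(1, len(dogs) - 1):          # interior scale levels only
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                v = dogs[s][y][x]
                if abs(v) < threshold:
                    continue
                neighbours = [dogs[s + ds][y + dy][x + dx]
                              for ds in (-1, 0, 1)
                              for dy in (-1, 0, 1)
                              for dx in (-1, 0, 1)
                              if (ds, dy, dx) != (0, 0, 0)]
                if v > max(neighbours) or v < min(neighbours):
                    keypoints.append((x, y, sigmas[s]))
    return keypoints

# Synthetic test image: one Gaussian blob centred at (12, 12).
SIZE, CENTRE, BLOB_VAR = 25, 12, 4.096
image = [[math.exp(-((x - CENTRE) ** 2 + (y - CENTRE) ** 2) / (2.0 * BLOB_VAR))
          for x in range(SIZE)] for y in range(SIZE)]
keypoints = dog_extrema(image)
```

The blob size is chosen so that its DoG response peaks at the middle scale level, so the blob centre should appear among the detections with sigma 1.6. Because the detection carries a scale as well as a position, a larger or smaller copy of the blob would be found at a correspondingly different sigma.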
Category level recognition (Thanks to Bastian Leibe)
Constellation model (Thanks to Bastian Leibe)
Implicit Shape Model (Thanks to Bastian Leibe)
Towards Scalable Representations of Object Categories: Learning a Hierarchy of Parts
Aleš Leonardis and Sanja Fidler
University of Ljubljana, Faculty of Computer and Information Science
Visual Cognitive Systems Laboratory
Reproduced with permission
Framework
Main properties of the framework:
• Computational plausibility
    Hierarchical representation
    Compositionality (parts composed of parts)
    Indexing & matching recognition scheme
• Statistics driven learning (unsupervised learning)
• Fast, incremental (continuous) learning
Recognition: Indexing and matching
[Figure: image → hypotheses (car, motorcycle, dog, person) → verification; label: LEARN]
Gradually limiting the search
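The hypothesise-then-verify scheme on this slide can be sketched as an inverted index followed by a matching step. This is only an illustration of the general idea, not the authors' implementation: the part names, the `MODELS` table and the 0.75 acceptance threshold are all invented.

```python
from collections import Counter

# Hypothetical part vocabularies for the four categories on the slide.
MODELS = {
    "car": {"wheel", "window", "headlight", "door"},
    "motorcycle": {"wheel", "handlebar", "headlight", "seat"},
    "dog": {"leg", "tail", "snout", "ear"},
    "person": {"leg", "arm", "head", "torso"},
}

def build_index(models):
    """Inverted index: part name -> set of categories that contain it."""
    index = {}
    for category, parts in models.items():
        for part in parts:
            index.setdefault(part, set()).add(category)
    return index

def hypothesise(observed, index):
    """Indexing: each observed part votes for the categories that contain it."""
    votes = Counter()
    for part in observed:
        for category in index.get(part, ()):
            votes[category] += 1
    return votes

def verify(observed, models, candidates, min_fraction=0.75):
    """Matching: keep hypotheses whose model is mostly explained by the image."""
    accepted = []
    for category in candidates:
        model = models[category]
        if len(model & observed) / len(model) >= min_fraction:
            accepted.append(category)
    return sorted(accepted)

index = build_index(MODELS)
observed = {"wheel", "headlight", "window", "door"}
hypotheses = hypothesise(observed, index)
detections = verify(observed, MODELS, hypotheses)
```

Here indexing proposes only "car" and "motorcycle", so "dog" and "person" are never matched at all, and verification then rejects "motorcycle". That is the sense in which the search is gradually limited.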
Overview of the architecture
Starts with simple, local features and learns more and more complex compositions
Learns layer after layer to exploit the regularities in natural images as efficiently and compactly as possible
Builds computationally feasible layers of parts by selecting only the most statistically significant compositions of specific granularity
Learns lower layers in a category independent way (to obtain optimally sharable parts) and category specific higher layers which contain only a small number of highly generalizable parts for each category
New categories can efficiently and continuously be added to the representation without the need to restructure the complete hierarchy
Implements parts in a robust, layered interplay of indexing & matching
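One way to picture "selecting only the most statistically significant compositions" is to count how often pairs of lower-layer parts co-occur at the same relative offset and keep only the frequent ones as parts of the next layer. The sketch below does just that. It is a simple counting stand-in, not the learning procedure of Fidler and Leonardis, and the part ids, coordinates, `radius` and `min_count` are assumptions.

```python
from collections import Counter

def learn_compositions(images, radius=2, min_count=2):
    """Return frequent (part, part, dx, dy) compositions, most frequent first.

    Each image is a list of (part_id, x, y) detections from the layer below.
    """
    counts = Counter()
    for detections in images:
        for p, px, py in detections:
            for q, qx, qy in detections:
                dx, dy = qx - px, qy - py
                # (dx, dy) > (0, 0) counts each unordered pair exactly once.
                if (dx, dy) > (0, 0) and abs(dx) <= radius and abs(dy) <= radius:
                    counts[(p, q, dx, dy)] += 1
    return [comp for comp, n in counts.most_common() if n >= min_count]

# Hypothetical layer-1 detections: "h" and "v" stand for edge-like parts.
images = [
    [("h", 0, 0), ("v", 1, 1), ("h", 5, 5)],
    [("h", 2, 2), ("v", 3, 3), ("h", 3, 2)],
    [("h", 4, 1), ("v", 5, 2), ("v", 0, 0)],
]
layer2 = learn_compositions(images)
```

Only the "h" with a "v" one step down and to the right recurs in every image, so it is the single composition kept. Running the same function on detections of the new layer-2 parts would give a third layer, and so on.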
Part based appearance recognition (Fidler & Leonardis 07)
Results
Learned hierarchy for faces and cars (first three layers are the same; links show compositionality for each of the categories; spatial variability of parts is not shown)
Results - Detections
Results - Specific categories, faces
Detection of Layer5 parts
Evidence from biology
Is human object recognition view dependent?
Shepard & Metzler
Pinker & Tarr
There is quite a large body of experimental data that supports the view dependent camp.
Appearance based approaches fit neatly with this camp.
Summary
This is not a resolved debate
There is evidence for both sides
Structural 3d information is almost certainly extracted by the brain too
Model based: how do we extract good enough low level features (e.g. a depth map)?
Appearance based: only seems to be good for recognition, which is a small part of the vision problem.