
Post on 21-Dec-2015


Page 1: Introduction to Object Recognition CS773C Machine Intelligence Advanced Applications Spring 2008: Object Recognition

Introduction to Object Recognition

CS773C Machine Intelligence Advanced Applications

Spring 2008: Object Recognition

Page 2

Outline

• The Problem of Object Recognition

• Approaches to Object Recognition

• Requirements and Performance Criteria

• Representation Schemes

• Matching Schemes

• Example Systems

• Indexing

• Grouping

• Error Analysis

Page 3

Problem Statement

• Given some knowledge of how certain objects may appear and an image of a scene possibly containing those objects, report which objects are present in the scene and where.

Recognition should be:

(1) invariant to viewpoint changes and object transformations

(2) robust to noise and occlusions

Page 4

Challenges

• The appearance of an object can have a large range of variation due to:

– photometric effects

– scene clutter

– changes in shape (e.g., non-rigid objects)

– viewpoint changes

• Different views of the same object can give rise to widely different images!

Page 5

Object Recognition Applications

• Quality control and assembly in industrial plants.

• Robot localization and navigation.

• Monitoring and surveillance.

• Automatic exploration of image databases.

Page 6

Human Visual Recognition

• A spontaneous, natural activity for humans and other biological systems.

– People know about tens of thousands of different objects, yet they can easily distinguish among them.

– People can recognize objects with movable parts or objects that are not rigid.

– People can balance the information provided by different kinds of visual input.

Page 7

Why Is It Difficult?

• Hard mathematical problems in understanding the relationship between geometric shapes and their projections into images.

• We must match an image to one of a huge number of possible objects, in any of an infinite number of possible positions (a computational complexity problem).

Page 8

Why Is It Difficult? (cont’d)

• We do not understand the recognition problem

Page 9

What do we do in practice?

• Impose constraints to simplify the problem.

• Construct useful machines rather than modeling human performance.

Page 10

Approaches Differ According To:

• Knowledge they employ

– Model-based approach (i.e., based on an explicit model of the object's shape or appearance)

– Context-based approach (i.e., based on the context in which objects may be found)

– Function-based approach (i.e., based on the function for which objects may serve)

Page 11

Approaches Differ According To: (cont’d)

• Restrictions on the form of the objects

– 2D or 3D objects

– Simple vs. complex objects

– Rigid vs. deforming objects

• Representation schemes

– Object-centered

– Viewer-centered

Page 12

Approaches Differ According To: (cont’d)

• Matching scheme

– Geometry-based

– Appearance-based

• Image formation model

– Perspective projection

– Affine transformation (e.g., planar objects)

– Orthographic projection + scale

Page 13

Requirements

• Viewpoint invariant

– Translation, rotation, scale

• Robust

– Noise (i.e., sensor noise)

– Local errors in early processing modules (e.g., edge detection)

– Illumination/shadows

– Partial occlusion (i.e., self and from other objects)

– Intrinsic shape distortions (i.e., non-rigid objects)

Page 14

Performance Criteria

• Scope

– What kinds of objects can be recognized and in what kinds of scenes?

• Robustness

– Does the method tolerate reasonable amounts of noise and occlusion in the scene?

– Does it degrade gracefully as those tolerances are exceeded?

Page 15

Performance Criteria (cont’d)

• Efficiency

– How much time and memory are required to search the solution space?

• Accuracy

– Correct recognition

– False positives (wrong recognitions)

– False negatives (missed recognitions)
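The three accuracy outcomes can be tallied per scene. The following is a minimal Python sketch; the function name `accuracy_counts` and the use of label sets are illustrative assumptions, not something the slides prescribe.

```python
def accuracy_counts(predicted, actual):
    """Count correct recognitions, false positives, and false negatives.

    `predicted` holds the object labels reported by the recognizer for one
    scene; `actual` holds the labels of the objects really present.
    """
    predicted, actual = set(predicted), set(actual)
    correct = len(predicted & actual)      # reported and present
    false_pos = len(predicted - actual)    # reported but absent (wrong recognitions)
    false_neg = len(actual - predicted)    # present but not reported (missed recognitions)
    return correct, false_pos, false_neg
```

For example, reporting {"cup", "phone"} for a scene that contains {"cup", "stapler"} gives one correct recognition, one false positive, and one false negative.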

Page 16

Representation Schemes

(1) Object-centered

(2) Viewer-centered

Page 17

Object-centered Representation

• Associates a coordinate system with the object

• The object geometry is expressed in this frame

Advantage: every view of the object is available

Disadvantage: might not be easy to build (i.e., reconstruct 3D from 2D).

Page 18

Object-centered Representation (cont’d)

• Two different matching approaches:

(1) Derive a similar object-centered description from the scene and match it with the models (e.g. using “shape from X” methods).

(2) Apply a model of the image formation process on the candidate model to back-project it onto the scene (camera calibration required).

Page 19

Viewer-centered Representation

• Objects are described by a set of characteristic views or aspects

Advantages: (i) easier to build than an object-centered representation, (ii) matching is easier since it involves 2D descriptions.

Disadvantage: requires a large number of views.

Page 20

Predicting New Views

• There is some evidence that the human visual system uses a “viewer-centered” representation for object recognition.

• It predicts the appearance of objects in images obtained under novel conditions by generalizing from familiar images of the objects.

Page 21

Predicting New Views (cont’d)

[Figure: familiar views used to predict a novel view]

Page 22

Matching Schemes

(1) Geometry-based: explore correspondences between model and scene features.

(2) Appearance-based: represent objects from all possible viewpoints and all possible illumination directions.

Page 23

Geometry-based Matching

• Advantage: efficient in “segmenting” the object of interest from the scene and robust in handling “occlusion”.

• Disadvantage: relies heavily on feature extraction, so performance degrades when imaging conditions give rise to poor segmentations.

Page 24

Appearance-based Matching

• Advantage: circumvents the feature extraction problem by enumerating many possible object appearances in advance.

• Disadvantages: (i) difficulty segmenting the objects from the background and dealing with occlusions, (ii) too many possible appearances, (iii) it is unclear how to sample the space of appearances.

Page 25

Model-Based Object Recognition

• The environment is rather constrained, and recognition relies upon the existence of a set of predefined objects.

Page 26

Goals of Matching

• Identify a group of features from an unknown scene which approximately match a set of features from a known view of a model object.

• Recover the geometric transformation that the model object has undergone

Page 27

Transformation Space

• 2D objects (2 translation, 1 rotation, 1 scale)

• 3D objects, perspective projection (3 rotation, 3 translation)

• 3D objects, orthographic projection + scale (essentially 5 parameters and a constant for depth)
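To make the parameter count concrete, here is a minimal Python sketch of the 2D case: one point moved by a transformation with 2 translation parameters, 1 rotation angle, and 1 scale factor. The function name `similarity_2d` and the argument order are assumptions made for illustration.

```python
import math

def similarity_2d(point, tx, ty, theta, s):
    """Apply a 4-parameter 2D transformation to a point.

    tx, ty : translation
    theta  : rotation angle in radians (counter-clockwise)
    s      : uniform scale
    The point is rotated and scaled first, then translated.
    """
    x, y = point
    c, sn = math.cos(theta), math.sin(theta)
    return (s * (c * x - sn * y) + tx,
            s * (sn * x + c * y) + ty)
```

Matching a planar object under this model means searching this 4D space; the 3D cases listed above enlarge it to 6 parameters.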

Page 28

Matching: Two Steps

• Hypothesis generation: the identities of one or more models are hypothesized.

• Hypothesis verification: tests are performed to check if a given hypothesis is correct or not.

Models

Page 29

Hypothesis Generation-Verification Example

Page 30

Efficient Hypothesis Generation

• How to choose the scene groups?

– Do we need to consider every possible group?

– How to find groups of features that are likely to belong to the same object?

– Use “grouping” schemes

• Database organization and searching

– Do we need to search the whole database of models?

– How should we organize the model database to allow for fast and efficient storage and retrieval?

– Use “indexing” schemes

Page 31

Interpretation Trees (E. Grimson and T. Lozano-Perez, 1987)

• Nodes of the tree represent match pairs (i.e., scene to model feature match).

• Each level of the tree represents all possible matches between one image feature f_i and the model features m_j.

• The tree represents the complete search space.

Page 32

Interpretation Trees (cont’d) (E. Grimson and T. Lozano-Perez, 1987)

• Interpretation: a path through the tree.

(Model features: m1, m2, m3, m4)

(Scene features: f1, f2)

• Use a depth-first tree search to find a match (or interpretation).

Page 33

Interpretation Trees (cont’d) (E. Grimson and T. Lozano-Perez, 1987)

• Search space is very large (i.e., exponential number of matches).

• Find consistent interpretations without exploring all possible ways of matching image and model features.

• Use geometric constraints to “prune” the tree:

– Unary constraints: properties of individual features (e.g., length/orientation of a line)

– Binary constraints: properties of pairs of features (e.g., distance/angle between two lines)
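The pruned depth-first search can be sketched in a few lines of Python. This is an illustrative simplification, not the published algorithm: each feature is assumed to be a tuple (x, y, length) giving a segment midpoint and length, the unary constraint compares lengths, the binary constraint compares midpoint distances (so only rigid motion without scaling is handled), and there is no "null" branch for occluded or spurious features. The names `interpretation_tree` and `tol` are hypothetical.

```python
import math

def interpretation_tree(scene, model, tol=0.1):
    """Depth-first search over scene-to-model pairings with constraint pruning.

    Level i of the tree assigns scene feature i to one model feature.
    Returns every consistent interpretation as a list of model indices,
    one index per scene feature.
    """
    def unary_ok(f, m):
        # Unary constraint: the two segments must have similar lengths.
        return abs(f[2] - m[2]) <= tol

    def binary_ok(f1, m1, f2, m2):
        # Binary constraint: distances between midpoints must agree.
        d_scene = math.hypot(f1[0] - f2[0], f1[1] - f2[1])
        d_model = math.hypot(m1[0] - m2[0], m1[1] - m2[1])
        return abs(d_scene - d_model) <= tol

    results = []

    def extend(level, path):
        if level == len(scene):
            results.append(list(path))  # reached a leaf: a full interpretation
            return
        f = scene[level]
        for j, m in enumerate(model):
            if j in path or not unary_ok(f, m):
                continue  # prune: model feature already used, or unary test fails
            if all(binary_ok(f, m, scene[k], model[path[k]]) for k in range(level)):
                path.append(j)
                extend(level + 1, path)
                path.pop()

    extend(0, [])
    return results
```

Because a branch is abandoned as soon as one constraint fails, most of the exponential tree is never visited.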

Page 34

Alignment Approach (Huttenlocher and Ullman, 1990)

• Most approaches searched for the largest pairing of model and image features for which there exists a single geometric transformation mapping each model feature to its corresponding image feature.

• The alignment approach seeks to recover the geometric transformation between the model and the scene using a minimum number of correspondences.

Page 35

Alignment Approach (cont’d) (Huttenlocher and Ullman, 1990)

• Weak perspective model (3 correspondences, O(m^3 n^3) cases):

x’ = Π(sRx + b)

– Π: orthographic projection

– s: scale

– R: 3D rotation

– b: translation

• Equivalent to an affine transformation (valid when the object is far from the camera and the object depth is small relative to the distance from the camera):

x’ = Lx + b
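For planar point sets, the affine form x’ = Lx + b has six unknowns, so three non-collinear point correspondences determine it exactly. The Python sketch below covers only this 2D case (it is not the full 3D method of the paper) and solves the two 3x3 linear systems with Cramer's rule. The helper names `affine_from_3`, `apply_affine`, and `verify`, and the tolerance `tol`, are assumptions for illustration.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, rhs):
    """Solve the 3x3 system A z = rhs by Cramer's rule."""
    d = det3(A)
    if abs(d) < 1e-12:
        raise ValueError("the three model points are collinear")
    solution = []
    for col in range(3):
        Ac = [row[:] for row in A]
        for r in range(3):
            Ac[r][col] = rhs[r]
        solution.append(det3(Ac) / d)
    return solution

def affine_from_3(model_pts, image_pts):
    """Recover L = (a, b, c, d) and translation (tx, ty) from 3 correspondences."""
    A = [[x, y, 1.0] for x, y in model_pts]
    a, b, tx = solve3(A, [u for u, _ in image_pts])
    c, d, ty = solve3(A, [v for _, v in image_pts])
    return (a, b, c, d), (tx, ty)

def apply_affine(L, t, p):
    a, b, c, d = L
    return (a * p[0] + b * p[1] + t[0], c * p[0] + d * p[1] + t[1])

def verify(L, t, model_pts, image_pts, tol=0.5):
    """Count model points that land within `tol` of some image point."""
    count = 0
    for p in model_pts:
        q = apply_affine(L, t, p)
        if any((q[0] - u) ** 2 + (q[1] - v) ** 2 <= tol * tol for u, v in image_pts):
            count += 1
    return count
```

Trying every triple of model points against every triple of image points is what produces the O(m^3 n^3) hypotheses; `verify` scores each one by how many remaining model points it explains.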

Page 36

Pose Clustering (e.g., Thompson and Mundy, 1987, Ballard, 1981)

• Main idea:

If there is a transformation that can bring into alignment a large number of features, then this transformation will receive a large number of votes.

Page 37

Pose Clustering (e.g., Thompson and Mundy, 1987, Ballard, 1981)

• Main steps:

(1) Quantize the space of possible transformations (usually 4D to 6D).

(2) For each hypothetical match, solve for the transformation that aligns the matched features.

(3) Cast a vote in the corresponding transformation space bin.

(4) Find "peak" in transformation space.
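The four steps above can be sketched in Python for 2D point features and a 4D similarity transformation (scale, angle, tx, ty). This is a simplified illustration under stated assumptions: every ordered pair of model points is matched against every ordered pair of scene points, all points are distinct, and the bin sizes in `bin_sizes` are arbitrary choices. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import permutations

def similarity_from_pair(m1, m2, s1, s2):
    """Similarity transform (scale, angle, tx, ty) mapping m1 -> s1 and m2 -> s2."""
    mdx, mdy = m2[0] - m1[0], m2[1] - m1[1]
    sdx, sdy = s2[0] - s1[0], s2[1] - s1[1]
    scale = math.hypot(sdx, sdy) / math.hypot(mdx, mdy)
    angle = (math.atan2(sdy, sdx) - math.atan2(mdy, mdx)) % (2 * math.pi)
    c, sn = scale * math.cos(angle), scale * math.sin(angle)
    tx = s1[0] - (c * m1[0] - sn * m1[1])
    ty = s1[1] - (sn * m1[0] + c * m1[1])
    return scale, angle, tx, ty

def pose_clustering(model, scene, bin_sizes=(0.25, math.radians(10), 1.0, 1.0)):
    """Vote in a quantized 4D pose space; return (winning bin, vote count)."""
    votes = Counter()
    for m1, m2 in permutations(model, 2):        # step 2: hypothetical matches
        for s1, s2 in permutations(scene, 2):
            pose = similarity_from_pair(m1, m2, s1, s2)
            # steps 1 and 3: quantize the pose and cast one vote in its bin
            key = tuple(int(math.floor(v / b)) for v, b in zip(pose, bin_sizes))
            votes[key] += 1
    return votes.most_common(1)[0]               # step 4: the peak
```

Correct pairings all produce the same pose and pile up in one bin, while incorrect pairings scatter. Poses that fall near a bin boundary can split their votes between neighbours, which is one reason practical systems smooth or cluster the accumulator.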

Page 38

Pose Clustering (example) (e.g., Thompson and Mundy, 1987, Ballard, 1981)

Page 39

Appearance-based Recognition (e.g., Murase and Nayar, 1995, Turk and Pentland, 1991)

• Represent an object by the set of its possible appearances (i.e., under all possible viewpoints and illumination conditions).

• Identifying an object implies finding the closest stored image.

Page 40

Appearance-based Recognition (cont’d) (e.g., Murase and Nayar, 1995, Turk and Pentland, 1991)

• In practice, a subset of all possible appearances is used.

• Images are highly correlated, so “compress” them into a low-dimensional space that captures key appearance characteristics (e.g., use Principal Component Analysis (PCA)).
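A minimal Python sketch of the idea: fit a low-dimensional subspace to a few stored images with PCA, then recognize a new image by finding the closest stored image in that subspace. Everything here is an illustrative assumption rather than the method of the cited papers: images are short flattened lists of pixel values, the principal directions are found by power iteration with deflation (chosen only to avoid external libraries), and the names `fit_eigenspace`, `project`, and `recognize` are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_eigenspace(images, k, iters=100, seed=0):
    """Return (mean, components): the mean image and up to k principal directions."""
    n, dim = len(images), len(images[0])
    mean = [sum(img[i] for img in images) / n for i in range(dim)]
    data = [[x - m for x, m in zip(img, mean)] for img in images]
    rng = random.Random(seed)
    components = []
    for _ in range(k):
        w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for _ in range(iters):
            # One power-iteration step: w <- X^T (X w), then normalize.
            scores = [dot(row, w) for row in data]
            w = [sum(s * row[i] for s, row in zip(scores, data)) for i in range(dim)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:
                return mean, components  # no variance left to explain
            w = [x / norm for x in w]
        components.append(w)
        # Deflation: remove the found direction from every data row.
        data = [[x - dot(row, w) * wi for x, wi in zip(row, w)] for row in data]
    return mean, components

def project(image, mean, components):
    """Coordinates of an image in the low-dimensional subspace."""
    centered = [x - m for x, m in zip(image, mean)]
    return [dot(centered, w) for w in components]

def recognize(image, gallery, mean, components):
    """Label of the stored image whose projection is closest to the query's."""
    q = project(image, mean, components)

    def dist2(label):
        p = project(gallery[label], mean, components)
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return min(gallery, key=dist2)
```

Usage: build `gallery` as a dict from label to flattened image, call `fit_eigenspace(list(gallery.values()), k)`, and pass the result to `recognize`. Distances are then computed between k numbers per image instead of one number per pixel.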

Page 41

Indexing-based Recognition

• Preprocessing step: groups of model features are used to index the database, and the indexed locations are filled with entries containing references to the model objects and information that can later be used for pose recovery.

• Recognition step: groups of scene features are used to index the database and the model objects listed in the indexed locations are collected into a list of candidate models (hypotheses).

Page 42

Indexing-Based Recognition (cont’d)

• Use a-priori stored information about the models to quickly eliminate non-feasible matches during recognition.
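The preprocessing and recognition steps can be sketched with an ordinary hash table in Python. As a stand-in for a real index function, this example assumes 2D point features and indexes each point triple by its quantized side-length ratios, which do not change under translation, rotation, or uniform scaling. The names `triangle_key`, `build_index`, and `candidates`, and the quantization `step`, are assumptions; pose information is not stored in this sketch.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations

def triangle_key(p, q, r, step=0.05):
    """Quantized index for a triple of distinct points.

    The two shorter side lengths divided by the longest are unchanged by
    translation, rotation, and uniform scaling of the triple.
    """
    sides = sorted(math.hypot(a[0] - b[0], a[1] - b[1])
                   for a, b in ((p, q), (q, r), (p, r)))
    return (round(sides[0] / sides[2] / step), round(sides[1] / sides[2] / step))

def build_index(models):
    """Preprocessing: file every model under the keys of its point triples."""
    table = defaultdict(set)
    for name, points in models.items():
        for triple in combinations(points, 3):
            table[triangle_key(*triple)].add(name)
    return table

def candidates(table, scene_points):
    """Recognition: look up scene triples and collect votes for candidate models."""
    votes = Counter()
    for triple in combinations(scene_points, 3):
        for name in table.get(triangle_key(*triple), ()):
            votes[name] += 1
    return votes
```

Models that never appear in a looked-up bin are eliminated without being examined; the surviving candidates are hypotheses that still need verification.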

Page 43

Invariants

• Properties that do not change with object transformations or viewpoint changes.

• Ideally, we would like the index computed from a group of model features to be invariant.

• Only one entry per group needs to be stored this way.

Page 44

Planar (2D) objects

• The index is computed based on invariant properties.

• One entry per group needs to be stored in this case.

Affine invariants (geometric hashing), Lamdan et al., 1988
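A short Python sketch of the kind of affine invariant that geometric hashing relies on: the coordinates (alpha, beta) of a point expressed in the frame defined by three non-collinear basis points. Applying the same affine transformation to all four points leaves (alpha, beta) unchanged. The function name `affine_coordinates` is an assumption for illustration.

```python
def affine_coordinates(p, e0, e1, e2):
    """Coordinates (alpha, beta) with p = e0 + alpha*(e1 - e0) + beta*(e2 - e0)."""
    ax, ay = e1[0] - e0[0], e1[1] - e0[1]
    bx, by = e2[0] - e0[0], e2[1] - e0[1]
    px, py = p[0] - e0[0], p[1] - e0[1]
    det = ax * by - ay * bx
    if abs(det) < 1e-12:
        raise ValueError("basis points are collinear")
    alpha = (px * by - py * bx) / det
    beta = (ax * py - ay * px) / det
    return alpha, beta
```

For example, with basis (0, 0), (2, 0), (0, 2) the point (1, 3) has coordinates (0.5, 1.5), and it keeps those coordinates after any invertible affine map of the whole configuration. Quantized, such coordinates can serve as the hash-table index.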

Page 45

Geometric Hashing

Page 46

Three-Dimensional Objects

• No general-case invariants exist for single views of general 3D objects (Clemens & Jacobs, 1991).

• Special case and model-based invariants (Rothwell et al., 1995, Weinshall, 1993)

Page 47

Indexing for 3D Object Recognition (cont’d)

• One approach might be ...

Page 48

Indexing for 3D Object Recognition (cont’d)

• Another approach might be ...

Page 49

Grouping

• Grouping is the process that organizes the image into parts, each likely to come from a single object.

• It reduces the number of hypotheses dramatically.

• Non-accidental properties (grouping cues)

– Orientation, collinearity, parallelism, proximity

Convex groups (Jacobs, 1996)
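Two of the cues listed above, parallelism and proximity, can be tested with a few lines of Python. This sketch is not the convex-grouping method of Jacobs; it assumes line segments given as pairs of endpoints, groups segments whose endpoints lie close together using a union-find structure, and uses hypothetical names and tolerances (`are_parallel`, `group_by_proximity`, `angle_tol`, `gap_tol`).

```python
import math

def orientation(seg):
    """Undirected orientation of a segment, in [0, pi)."""
    (x1, y1), (x2, y2) = seg
    return math.atan2(y2 - y1, x2 - x1) % math.pi

def are_parallel(s1, s2, angle_tol=math.radians(5)):
    d = abs(orientation(s1) - orientation(s2))
    return min(d, math.pi - d) <= angle_tol

def endpoint_gap(s1, s2):
    """Smallest distance between an endpoint of s1 and an endpoint of s2."""
    return min(math.hypot(p[0] - q[0], p[1] - q[1]) for p in s1 for q in s2)

def group_by_proximity(segments, gap_tol=1.0):
    """Merge segments whose endpoints are within gap_tol; return index groups."""
    parent = list(range(len(segments)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(segments)):
        for j in range(i + 1, len(segments)):
            if endpoint_gap(segments[i], segments[j]) <= gap_tol:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(segments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

Hypotheses are then generated only from features inside the same group, which is where the reduction in the number of hypotheses comes from.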

Page 50

Error Analysis

• Uncertainty in feature locations

– It is important to analyze the sensitivity of each algorithm with respect to uncertainty in the location of the image features.

• Case of indexing

– Analyze how errors in the locations of the points affect the invariants.
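One simple way to study this sensitivity empirically is a Monte Carlo experiment: jitter every point with Gaussian noise and record how far the invariant moves. The Python sketch below does this for affine coordinates with respect to a three-point basis. It is an illustrative experiment, not an analytical error model; the names `affine_coordinates` and `invariant_spread`, the noise level `sigma`, and the number of `trials` are assumptions, and the basis is assumed to stay non-collinear under the noise.

```python
import random

def affine_coordinates(p, e0, e1, e2):
    """Coordinates (alpha, beta) with p = e0 + alpha*(e1 - e0) + beta*(e2 - e0)."""
    ax, ay = e1[0] - e0[0], e1[1] - e0[1]
    bx, by = e2[0] - e0[0], e2[1] - e0[1]
    px, py = p[0] - e0[0], p[1] - e0[1]
    det = ax * by - ay * bx
    return (px * by - py * bx) / det, (ax * py - ay * px) / det

def invariant_spread(p, basis, sigma, trials=500, seed=0):
    """Largest deviation of (alpha, beta) when all points get Gaussian jitter.

    Each coordinate of p and of the three basis points is perturbed by
    zero-mean noise with standard deviation sigma.
    """
    rng = random.Random(seed)
    nominal = affine_coordinates(p, *basis)

    def jitter(q):
        return (q[0] + rng.gauss(0.0, sigma), q[1] + rng.gauss(0.0, sigma))

    worst = 0.0
    for _ in range(trials):
        noisy_p = jitter(p)
        noisy_basis = [jitter(e) for e in basis]
        a, b = affine_coordinates(noisy_p, *noisy_basis)
        worst = max(worst, abs(a - nominal[0]), abs(b - nominal[1]))
    return worst
```

The observed spread suggests how large the hash bins, or the search neighbourhood around a bin, would need to be for a noisy scene group to still retrieve its model entry.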

Page 51

Error Analysis (cont’d)

Page 52

References

• E. Grimson and T. Lozano-Perez, "Localizing overlapping parts by searching the interpretation tree", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, no. 4, pp. 469-482, July 1987.

• D. Huttenlocher and S. Ullman, "Recognizing solid objects by alignment with an image", International Journal of Computer Vision, vol. 5, no. 2, pp. 195-212, 1990.

• Y. Lamdan, J. Schwartz, and H. Wolfson, "Affine invariant model-based object recognition", IEEE Trans. on Robotics and Automation, vol. 6, no. 5, pp. 578-589, October 1990.

• I. Rigoutsos and R. Hummel, "A Bayesian approach to model matching with geometric hashing", CVGIP: Image Understanding, vol. 62, pp. 11-26, 1995.

Page 53

References (cont’d)

• D. Clemens and D. Jacobs, "Space and time bounds on indexing 3D models from 2D images", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 10, pp. 1007-1017, 1991.

• D. Thompson and J. Mundy, "Three dimensional model matching from an unconstrained viewpoint", IEEE Conference on Robotics and Automation, pp. 208-220, 1987.

• D. Ballard, "Generalizing the Hough transform to detect arbitrary patterns", Pattern Recognition, vol. 13, no. 2, pp. 111-122, 1981.

• H. Murase and S. Nayar, "Visual learning and recognition of 3D objects from appearance", International Journal of Computer Vision, vol. 14, pp. 5-24, 1995.

Page 54

References (cont’d)

• M. Turk and A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, Vol. 3, pp. 71-86, 1991.

• D. Jacobs, "Robust and efficient detection of salient convex groups", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 1, pp. 23-37, 1996.

• K. Bowyer and C. Dyer, "Aspect graphs: an introduction and survey of recent results", International Journal of Imaging Systems and Technology, vol. 2, pp. 315-328, 1990.