Computational Vision Jitendra Malik University of California at Berkeley

Post on 21-Dec-2015

Computational Vision

Jitendra Malik

University of California at Berkeley

Taxonomy of Vision Problems

• Reconstruction: estimate parameters of external 3D world.

• Visual Control: visually guided locomotion and manipulation.

• Segmentation: partition I(x,y,t) into subsets of separate objects.

• Recognition:

– classes: face vs. non-face

– activities: gesture, expression

Reconstruction

• Computer graphics is the forward problem: given scene geometry, reflectances and lighting, synthesize an image.

• Computer vision must address the inverse problem: given one or more images, reconstruct the scene geometry, reflectances and illumination.
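
The asymmetry between the forward and inverse problems can be illustrated with a minimal sketch, assuming a pinhole camera (the slides do not specify a camera model). The function `project` and the focal length `f` are hypothetical names chosen for this example. Projection is a simple function, but every point along a viewing ray lands on the same image location, so a single image alone does not determine depth.

```python
import numpy as np

def project(points_3d, f=1.0):
    """Forward problem: pinhole projection of Nx3 camera-frame points to Nx2 image points."""
    points_3d = np.asarray(points_3d, dtype=float)
    return f * points_3d[:, :2] / points_3d[:, 2:3]

# A scene point and the same point pushed 3x farther along its viewing ray.
near = np.array([[0.5, -0.2, 2.0]])
far = 3.0 * near

# Both project to the same image location: depth is lost in the forward map,
# which is why the inverse problem needs extra views or prior knowledge.
same_pixel = np.allclose(project(near), project(far))
```

Here `same_pixel` is true for any positive scale factor, not just 3.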

Recovering geometry

• Historical roots in photogrammetry and analysis of 3D cues in human vision

• Single images adequate given knowledge of object class

• Multiple images make the problem easier, but not trivial as corresponding points must be identified.
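
Once corresponding points have been identified, recovering a 3D point from two views is a small linear problem. The sketch below uses linear (DLT) triangulation, which is one standard choice and not necessarily the method behind the models shown in these slides. The camera matrices `P1` and `P2` and the test point are hypothetical values, and the correspondences are noise-free.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two corresponding image points.

    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) image coordinates.
    """
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # Homogeneous least squares: the solution is the right singular vector
    # associated with the smallest singular value of A.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point with a 3x4 camera matrix."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Two hypothetical cameras: one at the origin, one shifted one unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

X_true = np.array([0.3, -0.4, 5.0])
X_est = triangulate(P1, P2, project(P1, X_true), project(P2, X_true))
```

With exact correspondences `X_est` matches `X_true` up to floating-point error. With noisy or wrong matches the estimate degrades, which is why finding correspondences is the hard part.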

Arc de Triomphe

Taj Mahal modeled from one photograph by G. Borshukov

Recovered Campus Model

Campanile + 40 Buildings (Debevec et al.)

Inverse Global Illumination (Yu et al.)

[Diagram labels: Reflectance Properties, Radiance Maps, Geometry, Light Sources]

Real vs. Synthetic

Challenges in Reconstruction

• Finding correspondences automatically

• Optimal estimation of structure from n views under perspective projection

• Models of reflectance and texture for natural materials and objects

Control

• Visual feedback signal for control of manipulation tasks such as grasping, moving and assembly

• Visual feedback for guiding locomotion

– Obstacle avoidance for a moving robot

– Lateral and longitudinal control of driving

Challenges in control

• Delay in feedback loop due to visual processing

• Hierarchies in sensory motor control

– Open loop or closed loop

– Discrete planning or continuous control
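
The effect of processing delay on a closed loop can be seen in a toy simulation. This is a minimal sketch, assuming a discrete-time proportional controller that corrects a lateral offset from a stale measurement. The update rule, the function name `simulate`, and the gain and delay values are all assumptions made for illustration, not taken from the slides.

```python
def simulate(gain, delay, steps=100, x0=1.0):
    """Simulate x[t+1] = x[t] - gain * x[t - delay], a proportional controller
    that corrects a lateral offset using a measurement that is `delay` steps old."""
    # History is padded so that measurements before t=0 equal the initial offset.
    xs = [x0] * (delay + 1)
    for _ in range(steps):
        measured = xs[-1 - delay]
        xs.append(xs[-1] - gain * measured)
    return xs[delay:]

no_delay = simulate(gain=0.8, delay=0)
with_delay = simulate(gain=0.8, delay=3)

# Largest offset magnitude over the final 20 steps of each run.
final_no_delay = max(abs(x) for x in no_delay[-20:])
final_with_delay = max(abs(x) for x in with_delay[-20:])
```

With the same gain, the undelayed loop drives the offset to zero while the delayed loop oscillates with growing amplitude. In this toy model, a lower gain would make the delayed loop stable again, at the cost of a slower response.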

Image Segmentation

Boundaries of image regions are defined by a number of attributes:

– Brightness/color

– Texture

– Motion

– Stereoscopic depth

– Familiar configuration

Approaches

• Fitting a piecewise smooth surface to the image, e.g. Mumford and Shah

• Probabilistic inference using a Markov Random Field model of the image, e.g. Geman and Geman

• Graph partitioning using spectral techniques, e.g. Shi and Malik

Image Segmentation as Graph Partitioning

Build a weighted graph G=(V,E) from the image

V: image pixels

E: connections between pairs of nearby pixels

W_ij: probability that i & j belong to the same region

Partition the graph so that similarity within each group is large and similarity between groups is small: Normalized Cuts [Shi & Malik 97]
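
A two-way partition of this kind can be sketched with the usual spectral relaxation: take the eigenvector for the second-smallest eigenvalue of the normalized Laplacian and split it by sign. The code below is a minimal sketch on a toy one-dimensional "image". The Gaussian intensity affinity, the scale `sigma`, the fully connected graph, and thresholding at zero are all assumptions made for this example, not the exact choices of the cited paper.

```python
import numpy as np

def normalized_cut(W):
    """Two-way partition of a weighted graph via the spectral relaxation of normalized cuts.

    W: symmetric NxN affinity matrix with positive row sums.
    Returns a boolean array marking one side of the partition.
    """
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # Symmetric normalized Laplacian D^{-1/2} (D - W) D^{-1/2}.
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; the second eigenvector
    # corresponds to the relaxed normalized-cut indicator.
    _, vecs = np.linalg.eigh(L_sym)
    y = d_inv_sqrt * vecs[:, 1]
    # Simplest splitting rule: threshold at zero (an assumption of this sketch).
    return y > 0

# Toy "image": six pixel intensities forming two obvious groups.
intensity = np.array([0.10, 0.12, 0.11, 0.90, 0.88, 0.91])
diff = intensity[:, None] - intensity[None, :]
sigma = 0.3  # hypothetical affinity scale
W = np.exp(-(diff ** 2) / (2 * sigma ** 2))

labels = normalized_cut(W)
```

The first three pixels receive one label and the last three the other. Which group is marked `True` depends on the arbitrary sign of the eigenvector.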

Temporal Segmentation: Tracking

Challenges in Segmentation

• Interaction of multiple cues

• Local measurements to global percepts

• Interplay of image-driven and object model driven processing

Recognition

• Possible for both instances and object classes (Mona Lisa vs. faces, or Beetle vs. cars)

• Tolerant to changes in pose and illumination, and to occlusion

[Figure labels: measurement, animation, recognition]

Recognition of Gait and Gesture

[Figure label: run]

Challenges in recognition

• Unified framework for segmentation and recognition

• Representing shape variability in a category

• Interplay of discriminative vs generative models

Core disciplines

• Geometry

– Differential geometry

– Projective geometry

• Probability and Statistics

– Reconstruction = estimation

– Control = decision theory

– Segmentation = clustering

– Recognition = classification
