lecture 18: context and segmentation
DESCRIPTION
CS6670: Computer Vision. Noah Snavely. Lecture 18: Context and segmentation. Announcements. Project 3 out soon Final project. When is context helpful?. Information. Contextual features. Local features. Distance. Slide by Antonio Torralba. Is it just for small / blurry things?. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/1.jpg)
Lecture 18: Context and segmentation
CS6670: Computer VisionNoah Snavely
![Page 2: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/2.jpg)
Announcements
• Project 3 out soon• Final project
![Page 3: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/3.jpg)
When is context helpful?
Distance
Information
Local features
Contextual features
Slide by Antonio Torralba
![Page 4: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/4.jpg)
Is it just for small / blurry things?
Slide by Antonio Torralba
![Page 5: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/5.jpg)
Is it just for small / blurry things?
Slide by Antonio Torralba
![Page 6: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/6.jpg)
Is it just for small / blurry things?
Slide by Antonio Torralba
![Page 7: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/7.jpg)
Context is hard to fight!
Thanks to Paul Violafor showing me these
![Page 8: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/8.jpg)
more “Look-alikes”
![Page 9: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/9.jpg)
1
Don’t even need to see the object
Slide by Antonio Torralba
![Page 10: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/10.jpg)
Don’t even need to see the object
Chance ~ 1/30000 Slide by Antonio Torralba
![Page 11: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/11.jpg)
The influence of an object extends beyond its physical boundaries
But object can say a lot about the scene
![Page 12: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/12.jpg)
Object priming
Torralba, Sinha, Oliva, VSS 2001
Increasing contextual information
![Page 13: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/13.jpg)
Object priming
Torralba, Sinha, Oliva, VSS 2001
![Page 14: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/14.jpg)
Why context is important?
• Typical answer: to “guess” small / blurry objects based on a prior• most current vision systems
![Page 15: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/15.jpg)
So, you think it’s so simple?
What is this?
slide by Takeo Kanade
![Page 16: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/16.jpg)
Why is this a car?
![Page 17: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/17.jpg)
…because it’s on the road!
Why is this road?
![Page 18: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/18.jpg)
Why is this a road?
![Page 19: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/19.jpg)
Same problem in real scenes
![Page 20: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/20.jpg)
Why context is important?
• Typical answer: to “guess” small / blurry objects based on a prior• Most current vision systems
• Deeper answer: to make sense of the visual world• much work yet to be done!
![Page 21: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/21.jpg)
Why context is important?
• To resolve ambiguity• Even high-res objects can be ambiguous
– e.g. there are more people than faces in the world!
• There are 30,000+ objects, but only a few dozen can happen within a given scene
• To notice “unusual” things• Prevents mental overload
• To infer function of unknown object
![Page 22: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/22.jpg)
Sources of Context
from Divvala et al, CVPR 2009
![Page 23: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/23.jpg)
Semantic Context
Berg et al. 2004 Fei-Fei and Perona 2005
Gupta & Davis 2008 Rabinovich et al. 2007
![Page 24: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/24.jpg)
Local Pixel Context
Wolf & Bileschi 2006 Dalal & Triggs 2005
![Page 25: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/25.jpg)
Scene Gist Context
Global (low-dimensional) scene statistics
![Page 26: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/26.jpg)
Geometric Context
Geometric Context [Hoiem et al. ’2005]
Ground
Sky
Frontal LeftRight
![Page 27: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/27.jpg)
Photogrammetric Context
Hoiem et al 2006
Lalonde et al. 2008
![Page 28: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/28.jpg)
Illumination and Weather Context
Illumination context (Lalonde et al)
![Page 29: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/29.jpg)
Geographic Context
Hays & Efros 2008
![Page 30: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/30.jpg)
Temporal Context
Gallagher et al 2008
Liu et al. 2008
![Page 31: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/31.jpg)
Cultural Context
Photographer Bias
Society Bias
![Page 32: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/32.jpg)
Flickr Paris
![Page 33: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/33.jpg)
“uniformly sampled” Paris
![Page 34: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/34.jpg)
…or Notre Dame
![Page 35: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/35.jpg)
Why Photographers are Biased?
People want their pictures to be recognizable and/or interesting
vs.
![Page 36: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/36.jpg)
Why Photographers are Biased?
People follow photographic conventions
vs.
Simon & Seitz 2008
![Page 37: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/37.jpg)
“100 Special Moments” by Jason Salavon
![Page 38: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/38.jpg)
Empirical Evaluation of Context Divvala, Hoiem, Hays, Efros, Hebert,
CVPR 2009
Empirical Evaluation of Context Divvala, Hoiem, Hays, Efros, Hebert,
CVPR 2009
![Page 39: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/39.jpg)
Change in AccuraciesChange in Accuracies
Type Change
Small 97%
Large 15%
Occluded 79%
Non-Occluded -5%
Difficult 57%
Total 47%
• Percentage change in average precisions before and after including context information
• Context helps in case of impoverished appearance
![Page 40: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/40.jpg)
Detecting difficult objects
Office Maybethere is a mouse
Start recognizing the scene
Torralba, Murphy, Freeman. NIPS 2004.
![Page 41: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/41.jpg)
Detecting difficult objects
Detect first simple objects (reliable detectors) that provide strongcontextual constraints to the target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
![Page 42: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/42.jpg)
Detecting difficult objects (object parts)
Detect first simple objects (reliable detectors) that provide strongcontextual constraints to the target (screen -> keyboard -> mouse)
Torralba, Murphy, Freeman. NIPS 2004.
![Page 43: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/43.jpg)
…on the other hand
![Page 44: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/44.jpg)
Who needs context anyway?We can recognize objects even out of context
BanksySlide by Antonio Torralba
![Page 45: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/45.jpg)
Lecture 18a: Segmentation
CS6670: Computer VisionNoah Snavely
From Sandlot Science
![Page 46: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/46.jpg)
From images to objects
What defines an object?• Subjective problem, but has been well-studied• Gestalt Laws seek to formalize this
– proximity, similarity, continuation, closure, common fate
– see notes by Steve Joordens, U. Toronto
![Page 47: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/47.jpg)
Extracting objects
How could we do this automatically (or at least semi-automatically)?
![Page 48: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/48.jpg)
The Gestalt school• Grouping is key to visual perception• Elements in a collection can have properties that result
from relationships • “The whole is greater than the sum of its parts”
subjective contours occlusion
familiar configuration
http://en.wikipedia.org/wiki/Gestalt_psychology Slide from S.Lazebnik
![Page 49: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/49.jpg)
The ultimate Gestalt?
Slide from S.Lazebnik
![Page 50: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/50.jpg)
Gestalt factors
• These factors make intuitive sense, but are very difficult to translate into algorithms
Slide from S.Lazebnik
![Page 51: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/51.jpg)
Semi-automatic binary segmentation
![Page 52: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/52.jpg)
Intelligent Scissors (demo)
![Page 53: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/53.jpg)
Intelligent Scissors [Mortensen 95]• Approach answers a basic question
– Q: how to find a path from seed to mouse that follows object boundary as closely as possible?
![Page 54: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/54.jpg)
GrabCutGrabcut [Rother et al., SIGGRAPH 2004]
![Page 55: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/55.jpg)
Is user-input required?Our visual system is proof that automatic methods are
possible• classical image segmentation methods are automatic
Argument for user-directed methods?• only user knows desired scale/object of interest
![Page 56: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/56.jpg)
q
Automatic graph cut [Shi & Malik]
Fully-connected graph• node for every pixel• link between every pair of pixels, p,q
• cost cpq for each link
– cpq measures similarity
» similarity is inversely proportional to difference in color and position
p
Cpq
c
![Page 57: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/57.jpg)
Segmentation by Graph Cuts
Break Graph into Segments• Delete links that cross between segments• Easiest to break links that have low cost (similarity)
– similar pixels should be in the same segments
– dissimilar pixels should be in different segments
w
A B C
![Page 58: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/58.jpg)
Cuts in a graph
Link Cut• set of links whose removal makes a graph disconnected• cost of a cut:
A B
Find minimum cut• gives you a segmentation
![Page 59: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/59.jpg)
But min cut is not always the best cut...
![Page 60: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/60.jpg)
Cuts in a graph
A B
Normalized Cut• a cut penalizes large segments• fix by normalizing for size of segments
• volume(A) = sum of costs of all edges that touch A
![Page 61: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/61.jpg)
Interpretation as a Dynamical System
Treat the links as springs and shake the system• elasticity proportional to cost• vibration “modes” correspond to segments
– can compute these by solving an eigenvector problem– http://www.cis.upenn.edu/~jshi/papers/pami_ncut.pdf
![Page 62: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/62.jpg)
Interpretation as a Dynamical System
Treat the links as springs and shake the system• elasticity proportional to cost• vibration “modes” correspond to segments
– can compute these by solving an eigenvector problem– http://www.cis.upenn.edu/~jshi/papers/pami_ncut.pdf
![Page 63: Lecture 18: Context and segmentation](https://reader035.vdocuments.site/reader035/viewer/2022070405/56814023550346895dab8238/html5/thumbnails/63.jpg)
Color Image Segmentation