Roles of Appearance and Contextual Information
Devi Parikh, Larry Zitnick and Tsuhan Chen
When?
• High intra-class variance
– Chairs
• Low inter-class variance
– Lemon vs. tennis ball
• Occlusion
• Low resolution image
– Image of a far away scene
– Bad quality
[Rabinovich et al., ICCV 07]
[Bustos et al., CSUR 2005]
Popular motivations for context…
[Antonio Torralba]
However…
Let’s look at it more carefully…
Object appearance
Appearance + Context
Blind recognition
Low resolution
High resolution
Blind recognition
Appearance + Context
When?
• Humans?
• Machines?
Set-up
[Diagram labels: Low Resolution, High Resolution; Appearance, Context; Human, Machine; each combination marked "?"]
Low resolution: Appearance
Low resolution: Context
Low resolution: Appearance + Context
High resolution: Appearance
High resolution: Context
High resolution: Appearance + Context
11 subjects
2 sessions
3 scenarios
70 segments
Related work
[Torralba et al., Tech Report 2007]
Machines
$P(c \mid S) = \frac{1}{Z} \prod_{i=1}^{N} \psi_i(c_i) \prod_{i,j=1}^{N} \phi_{ij}(c_i, c_j)$
Appearance
– Texture + Shape [TextonBoost]*
– Color [GMM]
– Neural Network
Context
– Co-occurrence**
– Relative location
– Relative scale
Inference: Belief Propagation
*[Shotton et al., ECCV 2006] **[Rabinovich et al., ICCV 2007]
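The Machines slide combines a per-segment appearance term with pairwise context terms and normalizes by Z. A minimal sketch of that kind of factorized model follows. It is not the authors' implementation: the slides use belief propagation for inference, whereas this sketch swaps in exact enumeration, which is only practical for a handful of segments. The function name `posterior_marginals`, the three labels, and all numeric scores are made up for illustration.

```python
import itertools
import numpy as np

def posterior_marginals(unary, pairwise):
    """Exact per-segment label marginals for a tiny fully connected model.

    unary:    (N, K) array; unary[i, a] is the appearance score of label a
              for segment i (hypothetical values, not from the slides).
    pairwise: (K, K) symmetric array; pairwise[a, b] is the context
              compatibility of labels a and b (also hypothetical).
    """
    n_segments, n_labels = unary.shape
    marginals = np.zeros((n_segments, n_labels))
    z = 0.0
    # Enumerate every joint labeling; feasible only for very small N and K.
    for labels in itertools.product(range(n_labels), repeat=n_segments):
        score = 1.0
        for i, a in enumerate(labels):
            score *= unary[i, a]                      # appearance terms
        for i, j in itertools.combinations(range(n_segments), 2):
            score *= pairwise[labels[i], labels[j]]   # context terms
        z += score
        for i, a in enumerate(labels):
            marginals[i, a] += score
    return marginals / z

# Labels: 0 = grass, 1 = cow, 2 = boat (illustrative only).
unary = np.array([[0.80, 0.10, 0.10],    # segment 0: clearly grass
                  [0.10, 0.45, 0.45]])   # segment 1: cow vs. boat is ambiguous
pairwise = np.array([[1.0, 2.0, 0.2],
                     [2.0, 1.0, 0.2],
                     [0.2, 0.2, 1.0]])

with_context = posterior_marginals(unary, pairwise)
without_context = posterior_marginals(unary, np.ones((3, 3)))
print(with_context[1])     # context favors "cow" for the ambiguous segment
print(without_context[1])  # without context the ambiguity remains
```

With uniform pairwise terms the marginals reduce to the appearance scores alone, which mirrors the "appearance alone" condition; the non-uniform terms play the role of context.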
Machines
Co-occurrence** Relative location Relative scale
**[Rabinovich 2007]
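The three context cues named above can be illustrated with simple statistics over labeled regions. The sketch below is an assumption-laden illustration, not the definitions used in the talk: co-occurrence is counted per image over unordered label pairs, relative location is taken as the offset between bounding-box centers, and relative scale as the ratio of bounding-box areas. All function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(images):
    """Count how often each unordered label pair appears in the same image."""
    counts = Counter()
    for labels in images:
        # Sort so each pair has one canonical key, e.g. ("grass", "sky").
        for a, b in combinations(sorted(set(labels)), 2):
            counts[(a, b)] += 1
    return counts

def relative_location(box_a, box_b):
    """Offset of box_a's center from box_b's center; boxes are
    (x_min, y_min, x_max, y_max)."""
    cx_a, cy_a = (box_a[0] + box_a[2]) / 2, (box_a[1] + box_a[3]) / 2
    cx_b, cy_b = (box_b[0] + box_b[2]) / 2, (box_b[1] + box_b[3]) / 2
    return cx_a - cx_b, cy_a - cy_b

def relative_scale(box_a, box_b):
    """Ratio of box_a's area to box_b's area."""
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return area_a / area_b

# Toy label sets for three images (made up for illustration).
images = [["cow", "grass", "sky"], ["grass", "sky"], ["boat", "water", "sky"]]
counts = cooccurrence_counts(images)
print(counts[("grass", "sky")], counts[("boat", "cow")])

sky_box, grass_box = (0, 0, 2, 2), (0, 4, 4, 8)
print(relative_location(sky_box, grass_box), relative_scale(sky_box, grass_box))
```

In image coordinates with y increasing downward, a negative vertical offset means the first region sits above the second, which is the kind of regularity a relative-location cue would capture.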
Machines
• MSRC dataset
• Corel dataset
[Felzenszwalb and Huttenlocher, IJCV 2004]
Results
A: Appearance alone
C: Context alone
A+C: Appearance and context
Low High
High resolution images do not benefit from context
Results
A: Appearance alone
C: Context alone
A+C: Appearance and context
Low High
Low resolution images NEED context
Results
Results (Machine)
Context & Appearance help each other
Results (Machine)
Appearance hurts
Results (Machine)
Context very weak
Results (Machine)
Context makes no difference
Results (Machine)
Relative Location
Co-occurrence
Relative Scale
Results (Machine)
• Failure cases
Results (Machine)
                   MSRC     Corel
Existing (high)    75 [1]   81 [2]
Proposed (high)    91       93
Proposed (low)     83       86
[1] Yang et al., CVPR 2007   [2] He et al., ECCV 2006
Results (Machine)
Contributions
• Context is most useful when appearance information is weak
• Low resolution images are an appropriate venue for studying context
Discussion
• Improve appearance or context models to achieve human performance?
• Need to improve both appearance and context models
• In low resolution images, appearance information is similar for humans and machines
– Hence, appropriate venue for studying context
• Achieving human performance need not be the ultimate goal
Need to improve context models
Need to improve appearance models
Follow up work (PAMI 2011)
Results
A: Appearance alone
C: Context alone
A+C: Appearance and context
Low High
Machines do not leverage contextual information as effectively as humans
Are machines missing a source of context?
Different Sources of Context
Different Sources of Context: None
Different Sources of Context: Cooc
Different Sources of Context: Rel-scale
Different Sources of Context: Rel-loc
Different Sources of Context: All
Different Sources of Context: Blind
Different Sources of Context: Image
High Resolution Appearance
PASCAL
“Natural” scenes
Bounding boxes
Common pixels
More void
Results
[Bar chart: MSRC and PASCAL; vertical axis 50 to 95; bars: app, co-occ, rel-loc, rel-scale, all-exploded, all-no-void, blind, all]
Results
Co-occurrence information helps in both datasets. Relative location does not help in PASCAL.
MSRC Location Statistics
[Bar chart: axis 0 to 100; categories: Building, Grass, Tree, Cow, Sheep, Sky, Aeroplane, Water, Face, Car, Bicycle, Flower, Sign, Bird, Book, Chair, Road, Cat, Dog, Body, Boat]
PASCAL Location Statistics
[Bar chart: axis 0 to 100; categories: Aeroplane, Bicycle, Bird, Boat, Bottle, Bus, Car, Cat, Chair, Cow, Dining table, Dog, Horse, Motorbike, Person, Potted plant, Sheep, Sofa, Train, TV / monitor]
Results
Relative scale information does not help across the board.
Results
Our choice of visualization does not affect performance in MSRC.
Results
There is information in the “void” (unlabeled) pixels!
We leverage this cue for object detection
Congcong Li, Devi Parikh and Tsuhan Chen. Extracting Adaptive Contextual Cues from Unlabeled Regions, ICCV 2011
MUCH more interesting analysis and findings in the PAMI 2011 paper:
Exploring Tiny Images: The Roles of Appearance and Contextual Information for Machine and Human Object Recognition