8/9/2019 Gist of Scene

Gist: A Mobile Robotics Application of Context-Based Vision in Outdoor Environment

Christian Siagian

Laurent Itti

Univ. Southern California, CA, USA


Outline

Mobile robot localization

Biological approach to vision

Gist model

Testing and results

Discussion and conclusion


Mobile Robot Localization

Where are we?

Localization = identifying landmarks


Mobile Robot Localization

Indoors: strong assumptions of flat walls, narrow hallways, and solid angles

Ranging sensors (laser and sonar) for mapping

Outdoors: less conforming set of surfaces

Ranging sensors are less effective, vision is better


Robot Vision Localization

Object-based Vision Localization

Objects as landmarks


Robot Vision Localization

Region-based Vision Localization

Regions as landmarks


Gist

Definition and background

Essence, holistic characteristics of an image

Context information obtained within an eye saccade (approx. 150 ms)

Evidence of place-recognizing cells at the Parahippocampal Place Area (PPA)

Biologically plausible models of gist are yet to be proposed

Nature of tasks done with gist

Scene categorization/context recognition

Region priming/layout recognition

Resolution/scale selection


Human Vision Architecture

Visual Cortex: low-level filters, center-surround, and normalization

Saliency Model: attend to pertinent regions

Gist Model: compute general image characteristics

High-Level Vision:

Object recognition

Layout recognition

Scene understanding


Gist Model

Utilizes the same Visual Cortex raw features as the saliency model [Itti 2001]

Gist is theoretically non-redundant with saliency

Gist vs. Saliency

Instead of looking at the most conspicuous locations in the image, gist looks at the scene as a whole

Detection of regularities, not irregularities

Cooperation (accumulation) vs. competition (WTA) among locations

More spatial emphasis in saliency

Local vs. global/regional interaction
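The cooperation-versus-competition contrast above can be illustrated with a minimal Python sketch. It assumes a feature map stored as a list of rows; the function names `winner_take_all` and `accumulate` are hypothetical and are not taken from the talk. Saliency-style competition keeps only the single strongest location, while gist-style accumulation summarizes every location.

```python
def winner_take_all(feature_map):
    """Competition: return (row, col) of the single strongest response."""
    best_value, best_pos = None, None
    for y, row in enumerate(feature_map):
        for x, value in enumerate(row):
            if best_value is None or value > best_value:
                best_value, best_pos = value, (y, x)
    return best_pos


def accumulate(feature_map):
    """Cooperation: average the responses of all locations."""
    values = [value for row in feature_map for value in row]
    return sum(values) / len(values)


# Illustrative values only, not data from the experiments.
feature_map = [[1.0, 2.0],
               [9.0, 2.0]]
print(winner_take_all(feature_map))  # (1, 0)
print(accumulate(feature_map))       # 3.5
```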


Gist Model Implementation

V1 raw image feature maps

Orientation channel: Gabor filters at 4 angles (0°, 45°, 90°, 135°) on 4 scales = 16 sub-channels

Color: red-green and blue-yellow center-surround, each with 6 scale combinations = 12 sub-channels

Intensity: dark-bright center-surround with 6 scale combinations = 6 sub-channels

Total of 34 sub-channels
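The sub-channel count above can be checked by enumerating the channels. This is only a bookkeeping sketch: the scale indices and the center/surround level pairs below are assumptions chosen to give 4 scales and 6 combinations, since the slide states only the counts, and the channel names are hypothetical.

```python
ORIENTATIONS_DEG = (0, 45, 90, 135)
ORIENTATION_SCALES = (0, 1, 2, 3)  # assumed scale indices; the slide gives only the count

# Assumed center/surround pyramid levels; the slide states only "6 scale combinations".
CENTER_SURROUND_PAIRS = [(c, c + delta) for c in (2, 3, 4) for delta in (3, 4)]


def sub_channel_names():
    """List one hypothetical name per sub-channel: orientation, color, intensity."""
    names = [f"orientation_{angle}deg_scale{s}"
             for angle in ORIENTATIONS_DEG for s in ORIENTATION_SCALES]
    names += [f"{pair}_c{c}_s{s}"
              for pair in ("red-green", "blue-yellow")
              for c, s in CENTER_SURROUND_PAIRS]
    names += [f"intensity_c{c}_s{s}" for c, s in CENTER_SURROUND_PAIRS]
    return names


names = sub_channel_names()
print(len(names))       # 34 sub-channels (16 + 12 + 6)
print(len(names) * 16)  # 544 raw gist features at 16 features per sub-channel
```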


Gist Model Implementation

Gist Feature Extraction

Average values over a predetermined grid
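Grid averaging can be sketched in a few lines of Python. The 4 x 4 grid is an assumption chosen to match the 16 features per sub-channel quoted in the talk, and the function name `grid_means` is hypothetical. The map must have at least as many rows and columns as the grid.

```python
def grid_means(feature_map, rows=4, cols=4):
    """Average a 2-D feature map (list of equal-length rows) over a rows x cols grid."""
    height, width = len(feature_map), len(feature_map[0])
    means = []
    for r in range(rows):
        y0, y1 = r * height // rows, (r + 1) * height // rows
        for c in range(cols):
            x0, x1 = c * width // cols, (c + 1) * width // cols
            cell = [feature_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            means.append(sum(cell) / len(cell))
    return means


# Illustrative 8 x 8 map in which every 2 x 2 cell holds one constant value.
feature_map = [[(y // 2) * 4 + (x // 2) for x in range(8)] for y in range(8)]
print(len(grid_means(feature_map)))  # 16 features for this sub-channel
```

Applying the same function to each of the 34 sub-channel maps and concatenating the results would give the 544-dimensional raw gist vector.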


Gist Model Implementation

Dimension Reduction

Original: 34 sub-channels x 16 features = 544 features

PCA/ICA reduction: 80 features

Kept >95% of variance
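One common way to pick a reduced dimension is to keep the fewest principal components whose variances sum to a target fraction of the total. The sketch below assumes the eigenvalues of the feature covariance matrix are already available; the function name and the sample values are illustrative, and the talk does not say how its 80 dimensions were chosen.

```python
def components_for_variance(eigenvalues, target=0.95):
    """Smallest number of leading components whose variance share reaches target."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= target:
            return k
    return len(ordered)


# Illustrative eigenvalues only, not values from the experiments.
print(components_for_variance([6.0, 3.0, 1.0], target=0.85))  # 2
```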


Gist Model Implementation

Place Classification

Three-layer neural networks
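A forward pass through a small feed-forward classifier might look like the sketch below. The slides do not give the layer sizes, activation function, or training method, so the tanh units, the tiny weight matrices, and the function names are all assumptions made for illustration; in the real system the input would be the reduced gist vector and each output unit would stand for one place.

```python
import math


def forward(x, layers):
    """Propagate x through a list of (weights, biases) layers with tanh units."""
    activation = x
    for weights, biases in layers:
        activation = [
            math.tanh(sum(w * a for w, a in zip(row, activation)) + b)
            for row, b in zip(weights, biases)
        ]
    return activation


def classify(x, layers):
    """Return the index of the most active output unit (the predicted place)."""
    outputs = forward(x, layers)
    return max(range(len(outputs)), key=outputs.__getitem__)


# Hypothetical hand-set weights: 2 inputs, 2 hidden units, 2 outputs.
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
print(classify([2.0, 0.0], layers))  # 0
print(classify([0.0, 2.0], layers))  # 1
```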


System Example Run


Testing & Results

Site selection:

Different challenges appearance-wise

Variability in area covered/path lengths

Various lighting conditions

Single-view filming

Clean break between segments

Scalability: combine all sites


Map of Experiment Sites


Site 1: Building Complex


Site 1 Experiment

Figure panels: Input Image, Gist Feature-vectors, System Output, PCA/ICA reduced features


Site 1 Results

Output Label

Assigned Label


Site 2: Vegetation-filled Park


Site 2 Result

Output Label

Assigned Label


Site 3: Open Field Park


Site 3 Result

Output Label

Assigned Label


Combined Sites Result


Discussion & Conclusion

Result of current model:

Success rate between 82.48% and 87.93%

Combined rate of 85.96%

4.73% error in inter-site classification

Integrating saliency for robot navigation

Localization within segment

Identifying discriminating cues in the environment

Issues in object-based systems still apply

Bad view detection

Foreground objects sometimes occlude the whole view

Obstacle avoidance, exploration, etc.
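A success rate like those quoted under "Result of current model" is typically read off a confusion matrix of assigned labels against output labels. The sketch below assumes a square matrix of frame counts with correct classifications on the diagonal; the function name and the counts are illustrative and are not data from the experiments.

```python
def success_rate(confusion):
    """Percentage of frames on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# Hypothetical counts: rows are assigned labels, columns are output labels.
confusion = [[8, 2],
             [1, 9]]
print(success_rate(confusion))  # 85.0
```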

