
Networks and Hierarchical Processing: Object Recognition in Human and Computer Vision

Guest Lecture: Marius Cătălin Iordan
CS 131 Computer Vision: Foundations and Applications

01 December 2014

1. Processing Pathways in the Human Visual System
• “what” and “where” pathways
• building features in the ventral stream

2. Hierarchical Pattern Recognition Systems

• early stages: small-scale neural network
• injecting neuroscience knowledge into design

3. Third Age of the Neural Network: Modern Deep Nets

• extremely large scale (data + computation)
• sudden, huge performance boost for recognition

Roadmap

Processing Pathways · review

From Retina to Cortex
world → retina (compression) → LGN → V1 → visual cortex (expansion)

Processing Pathways · review

The Flow of Information

Weiner & Grill-Spector (2012)

(figure: stages of the visual hierarchy, ending in area IT)

Processing Pathways · review
Van Essen (1991)
(figure: hierarchy of visual areas, from the retina through V1 to higher cortex)

Processing Pathways · “what” and “where” pathways

Specialization: “What” and “Where” Pathways

Mishkin & Ungerleider 1982

monkey lesion studies
lesion “where” pathway: difficulty in spatial reasoning
lesion “what” pathway: difficulty in object recognition

Processing Pathways · building features in the ventral stream

Object Recognition: The “What” Pathway

DiCarlo & Cox (2007)

(figure: ventral stream from V1 to IT)

Processing Pathways · building features in the ventral stream

Object Recognition: The “What” Pathway

Freeman & Simoncelli (2011), Tanaka (1997)

Processing Pathways · key points

Object Recognition: Building Features and Invariance

DiCarlo & Cox (2007)

invariance is built gradually across many successive steps
each area performs a transformation on its inputs

visual processing is done in stages

Hierarchical Computation

2. Hierarchical Pattern Recognition Systems

neuroscience-inspired computer vision

Hierarchical Computation · neocognitron

Neocognitron: Neural Network

(figure: network of cells organized into Layer 1, Layer 2, and Layer 3)

Hierarchical Computation · neocognitron

Neocognitron: Neural Network

Fukushima (1988)

input layer

S-cell layer

C-cell layer

Hierarchical Computation · neocognitron

Neocognitron: S-Cell

Fukushima (1988)

(figure: S-cell computation: a weighted SUM over a patch of the preceding C-cell layer (or the input layer), followed by a THRESHOLD; the input weights are set by LEARNING and the result feeds the next S-cell layer)
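A minimal NumPy sketch of that S-cell idea (an illustration, not Fukushima's exact equations): a weighted sum over a local patch of the previous layer, passed through a threshold nonlinearity. The template, patch, and threshold values below are made up.

import numpy as np

def s_cell_response(patch, weights, threshold):
    # weighted SUM over a local patch of the previous layer,
    # then a THRESHOLD: only above-threshold activity is passed on
    activation = np.sum(weights * patch)
    return max(activation - threshold, 0.0)

# toy example: a 3x3 template tuned to a vertical bar
patch = np.array([[0., 1., 0.],
                  [0., 1., 0.],
                  [0., 1., 0.]])      # vertical bar present in the input patch
weights = np.array([[0., 1., 0.],
                    [0., 1., 0.],
                    [0., 1., 0.]])    # "learned" template (here fixed by hand)
print(s_cell_response(patch, weights, threshold=2.0))   # 1.0: the feature is detected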

Hierarchical Computation · neocognitron

Neocognitron: S-Cell

figure courtesy of A. Alahi

Hierarchical Computation · neocognitron

Neocognitron: C-Cell “Pooling”

Fukushima (1988)

building position invariance
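A sketch of the C-cell ("pooling") idea in the same spirit: a C-cell responds if any S-cell in a small neighborhood responds, so the feature is detected even when its position shifts slightly. Max pooling is used here as a simple stand-in; Fukushima's actual C-cells use a softer, saturating form of pooling.

import numpy as np

def c_cell_response(s_cell_outputs):
    # respond if ANY S-cell in the pooled neighborhood fired:
    # the exact position of the feature no longer matters
    return np.max(s_cell_outputs)

# the same feature at slightly different positions gives the same pooled output
print(c_cell_response(np.array([0.0, 0.9, 0.0])))   # 0.9
print(c_cell_response(np.array([0.9, 0.0, 0.0])))   # 0.9 (input shifted, output unchanged)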

Hierarchical Computation · neocognitron

Neocognitron: Network

Fukushima (1988)

input → S-cell → C-cell → S-cell → C-cell → S-cell → C-cell → S-cell → C-cell
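A self-contained sketch of that alternating structure on a 1-D toy signal (illustrative only; templates, thresholds, and pool sizes are made up): each S stage detects a feature at every position, each C stage pools and downsamples, so later stages see larger, more position-tolerant features.

import numpy as np

def s_layer(signal, template, threshold=0.5):
    # slide one S-cell template across the input: feature detection at every position
    k = len(template)
    return np.array([max(np.dot(signal[i:i + k], template) - threshold, 0.0)
                     for i in range(len(signal) - k + 1)])

def c_layer(responses, pool=2):
    # pool neighboring S-cell responses: position invariance + downsampling
    return np.array([responses[i:i + pool].max()
                     for i in range(0, len(responses) - pool + 1, pool)])

# input -> S -> C -> S -> C: deeper stages combine earlier features
x = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0], dtype=float)
h = c_layer(s_layer(x, template=np.array([1.0, 1.0])))   # first stage: detect pairs of 1s, then pool
y = c_layer(s_layer(h, template=np.array([1.0, 1.0])))   # second stage: combine first-stage features
print(h, y)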

Hierarchical Computation · neocognitron

Neocognitron: Robust Results

Fukushima (1988)

input layer

S-cell layer

C-cell layer

Hierarchical Computation · neocognitron · key points

Neocognitron

invariance is built gradually across many successive steps
simple neural network solves a complicated non-linear problem

biologically inspired hierarchical processing pipeline

Fukushima (1988)

Hierarchical Computation · feed-forward model

Feed-Forward Object Recognition Model

Hubel & Wiesel (1962), Riesenhuber & Poggio (1999), Serre et al. (2007)

Hierarchical Computation · feed-forward model

Feed-Forward Object Recognition Model

task: is there an animal in the picture?

Serre et al. (2007)
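In this kind of model, the animal / no-animal decision is read out by a simple classifier on top of the final feature layer. Below is a minimal scikit-learn sketch of that readout step only, using made-up feature vectors in place of the model's actual hierarchical (S/C) features:

import numpy as np
from sklearn.svm import LinearSVC

# stand-in feature vectors; in the real model these would come from the
# hierarchical feature stages applied to each image
rng = np.random.default_rng(0)
animal_features = rng.normal(loc=+1.0, size=(100, 50))
no_animal_features = rng.normal(loc=-1.0, size=(100, 50))

X = np.vstack([animal_features, no_animal_features])
y = np.array([1] * 100 + [0] * 100)       # 1 = animal present, 0 = no animal

classifier = LinearSVC().fit(X, y)        # simple linear readout on top of the features
print(classifier.score(X, y))             # accuracy of the animal / no-animal decision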

Hierarchical Computation · feed-forward model · key points

Feed-Forward Object Recognition Model

Serre et al. (2007)

biologically-inspired processing pipeline

patches as receptive fields, building invariance

hierarchical processing

major drawback? no feedback

Discussion

how closely should we aim to copy human vision?

is reverse-engineering human vision a self-imposed limitation?

perfect recognition vs. visual understanding?

Modern Neural Networks

3. Modern Neural Networks

extremely large scale data and computation

Modern Neural Networks · deep CNN

Deep Convolutional Neural Network

Krizhevsky et al. (2012)

650,000 cells — 60,000,000 parameters

input → convolution → pooling → convolution → pooling → … → fully-connected
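A minimal PyTorch sketch of that convolution / pooling / fully-connected pattern (assuming PyTorch is available; the layer sizes below are toy values, not those of the actual 60-million-parameter network, which takes 224x224 ImageNet images and outputs 1000 classes):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=5, padding=2),    # convolution: learned filter bank (S-cell-like)
    nn.ReLU(),
    nn.MaxPool2d(2),                               # pooling: position invariance (C-cell-like)
    nn.Conv2d(16, 32, kernel_size=5, padding=2),   # convolution
    nn.ReLU(),
    nn.MaxPool2d(2),                               # pooling
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                     # fully-connected readout (10 toy classes)
)

x = torch.randn(1, 3, 32, 32)                      # one 32x32 RGB image
print(model(x).shape)                              # torch.Size([1, 10])
print(sum(p.numel() for p in model.parameters()))  # parameter count of this toy network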

Modern Neural Networks · CS 231N

Deep Convolutional Neural Network

Want to know more about state-of-the-art neural networks?

CS 231N http://vision.stanford.edu/cs231n/

Winter QT 2015 Fei-Fei Li & Andrej Karpathy

Processing Pathways · key points

Object Recognition: Building Features and Invariance

DiCarlo & Cox (2007)

invariance is built gradually across many successive steps
each area performs a transformation on its inputs

visual processing is done in stages

Hierarchical Computation · feed-forward model · key points

Feed-Forward Object Recognition Model

Serre et al. (2007)

biologically-inspired processing pipeline

patches as receptive fields, building invariance

hierarchical processing

major drawbacks? no feedback

much less complex than human vision

Modern Neural Networks · deep CNN · key points

Deep Convolutional Neural Network

Krizhevsky et al. (2012)

extremely large scale data and computation

sudden, huge performance boost for recognition

if you want to know more, take CS 231N in Winter QT

End-Quarter Feedback Forms