qingqun kong 2011.7 - iavision.ia.ac.cn/zh/senimar/reports/visnet.pdf · visnet a model of...

Qingqun Kong

2011.7.12

Visnet A model of invariant object recognition

Edmund T. Rolls and Gustavo Deco, "Computational Neuroscience of Vision", Oxford University Press, 2002

Visnet A model of invariant object representation

Hierarchical network

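Outline

Introduction to object recognition

Physiological mechanisms of object recognition

Approaches to object recognition

Visnet

Implementation of Visnet and analysis of results

Future work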

Invariant object recognition

[Diagram, built up over several slides: Inputs (images of different objects at different positions) -> model (Visnet) -> Outputs (invariant object representation, labels)]

Invariant object recognition

Solving translation (view, size, …) invariance: responding to the same local spatial arrangement of features while ignoring the global position of the object

Recognizing an object under different transforms after just a few seconds of inspecting it


Physiological mechanisms of object representation

Approaches to object representation

Visnet


Future work

Neurophysiological mechanisms

Hierarchical network

Feed forward connection

Lateral connection

Sparse representation

Local representation

Distributed representation

Representing similarity by vector correlation;

Exponential coding capacity;
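The two properties listed for distributed representations can be illustrated with a small sketch. It assumes binary (firing / not firing) neurons, which the slides do not state; the firing patterns `face_a`, `face_b` and `chair` and the helper names are hypothetical.

```python
import math

def pearson(a, b):
    """Pearson correlation between two firing-rate vectors of equal length."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def capacity(n_neurons, kind):
    """Number of distinct stimuli that n binary neurons can encode."""
    if kind == "local":        # one dedicated neuron per stimulus
        return n_neurons
    if kind == "distributed":  # any on/off pattern is a code word
        return 2 ** n_neurons
    raise ValueError("kind must be 'local' or 'distributed'")

# Hypothetical population responses over 8 neurons.
face_a = [1, 1, 0, 1, 0, 0, 1, 0]
face_b = [1, 1, 0, 1, 0, 0, 0, 1]
chair = [0, 0, 1, 0, 1, 1, 0, 0]

similar = pearson(face_a, face_b)    # overlapping patterns: positive correlation
dissimilar = pearson(face_a, chair)  # disjoint patterns: negative correlation
```

With 8 neurons the local code distinguishes 8 stimuli while the binary distributed code distinguishes 256, which is the sense in which capacity grows exponentially with the number of neurons.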


Sparse coding

Temporal properties

When an object is translated to a nearby position, this happens within a short period, so the membrane of the postsynaptic neuron is still in its 'Hebb-modifiable' state, and the synapses from the presynaptic afferents activated by the object in its new position onto that neuron are strengthened.




Visnet


Future work

Approaches to invariant object recognition

Feature space

Regardless of the relative arrangement of the features

Some birds (pigeons)

Structural descriptions and syntactic pattern recognition

3D descriptions

Necessary for language to provide descriptions of objects

Template matching and the alignment approach

Active vision (some invertebrates)

Feature hierarchies and 2D view-based object recognition

Visnet




Visnet


Future work

Visnet

Architecture of Visnet

The forward connections to each cell are drawn from a topologically corresponding region of the preceding layer, using a Gaussian distribution of connection probabilities.
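One way to picture this wiring is the sketch below, which picks afferents for a single cell by drawing offsets from a Gaussian centred on the cell's own position in the preceding layer. The layer size, number of connections, `sigma`, and the wrap-around at the layer edges are all illustrative assumptions, not values given in the slides.

```python
import random

def sample_afferents(cx, cy, layer_size, n_connections, sigma, rng):
    """Pick distinct cells of the preceding layer that feed the cell at (cx, cy).

    Offsets are drawn from a Gaussian with standard deviation `sigma`, so
    nearby cells are more likely to be connected than distant ones.
    Positions wrap around the layer edges (an assumption of this sketch).
    """
    afferents = set()
    while len(afferents) < n_connections:
        dx = round(rng.gauss(0.0, sigma))
        dy = round(rng.gauss(0.0, sigma))
        afferents.add(((cx + dx) % layer_size, (cy + dy) % layer_size))
    return sorted(afferents)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
conns = sample_afferents(cx=16, cy=16, layer_size=32,
                         n_connections=100, sigma=6.0, rng=rng)
```

Using a set keeps each afferent unique, so a cell never receives two connections from the same position.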

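Input to Visnet

[Diagram: Camera -> image $I(x, y)$ of objects at different positions (retina) -> filter $\Gamma_{xy}(\rho, \theta, f)$ (V1) -> Inputs -> Visnet -> Outputs]

Filtered input: $I(x, y) * \Gamma_{xy}(\rho, \theta, f)$

$$\Gamma_{xy}(\rho, \theta, f) = \rho \left[ e^{-\left(\frac{x\cos\theta + y\sin\theta}{\sqrt{2}/f}\right)^{2}} - \frac{1}{1.6}\, e^{-\left(\frac{x\cos\theta + y\sin\theta}{1.6\sqrt{2}/f}\right)^{2}} \right] e^{-\left(\frac{x\sin\theta - y\cos\theta}{3\sqrt{2}/f}\right)^{2}}$$

$\rho = \pm 1$

$f = 0.5,\ 0.25,\ 0.125,\ 0.0625$

$\theta = 0^\circ,\ 45^\circ,\ 90^\circ,\ 135^\circ$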
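The filter stage can be sketched in plain Python. The sketch assumes the oriented difference-of-Gaussians form used in published descriptions of VisNet (a narrow Gaussian minus a broader one, scaled by 1/1.6, across the preferred orientation, multiplied by a Gaussian envelope along it); the function names and the `half_width` parameter are hypothetical.

```python
import math

def visnet_filter(x, y, rho, theta_deg, f):
    """Value of the oriented difference-of-Gaussians filter at pixel offset (x, y).

    rho is the sign (+1 or -1), theta_deg the orientation in degrees,
    f the spatial frequency in cycles per pixel.
    """
    theta = math.radians(theta_deg)
    u = x * math.cos(theta) + y * math.sin(theta)  # across the orientation
    v = x * math.sin(theta) - y * math.cos(theta)  # along the orientation
    s = math.sqrt(2.0) / f
    centre = math.exp(-((u / s) ** 2))
    surround = math.exp(-((u / (1.6 * s)) ** 2)) / 1.6
    envelope = math.exp(-((v / (3.0 * s)) ** 2))
    return rho * (centre - surround) * envelope

def filter_kernel(rho, theta_deg, f, half_width):
    """Square kernel of side 2 * half_width + 1, indexed as kernel[row][col]."""
    span = range(-half_width, half_width + 1)
    return [[visnet_filter(x, y, rho, theta_deg, f) for x in span] for y in span]

# One kernel per combination of sign, orientation and spatial frequency.
bank = {
    (rho, theta, f): filter_kernel(rho, theta, f, half_width=4)
    for rho in (1, -1)
    for theta in (0, 45, 90, 135)
    for f in (0.5, 0.25, 0.125, 0.0625)
}
```

Convolving the image with each of the 32 kernels in `bank` would give the filtered inputs to layer 1; the kernel size here is chosen only to keep the example small.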

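Learning Process (take layer 1 for example)

1. The activation of each neuron

$h_i = \sum_j x_j w_{ij}$

2. Competition and lateral inhibition

$r = h * I$ (the activations convolved with the lateral-inhibition filter $I$)

3. Contrast enhancement

The threshold of the contrast-enhancement function is used to control the sparseness of firing rates within each layer

4. Updating weights

$\delta w_{ij} = \alpha\, \bar{y}_i^{\tau} x_j^{\tau}$

$\bar{y}_i^{\tau} = (1 - \eta)\, y_i^{\tau} + \eta\, \bar{y}_i^{\tau - 1}$

$w_{ij} \leftarrow w_{ij} + \delta w_{ij}$

$\delta w_{ij} = \alpha\, \bar{y}_i^{\tau} (x_j^{\tau} - w_{ij})$

5. Return to step 1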
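The learning steps can be put together in a minimal sketch for one small layer. Several parts are assumptions rather than the slides' method: lateral inhibition is simplified to subtracting the mean activation (instead of a spatial convolution), contrast enhancement is a sigmoid whose threshold is set from a target sparseness, weights are renormalised after each update, and all parameter values (`alpha`, `eta`, `sparseness`, `beta`) are hypothetical.

```python
import math

def activation(x, w):
    """Step 1: h_i = sum_j x_j * w_ij for every neuron i."""
    return [sum(xj * wij for xj, wij in zip(x, row)) for row in w]

def lateral_inhibition(h):
    """Step 2 (simplified assumption): subtract the mean activation."""
    mean_h = sum(h) / len(h)
    return [hi - mean_h for hi in h]

def contrast_enhance(r, sparseness, beta):
    """Step 3 (assumption): sigmoid with a threshold chosen so that roughly
    a fraction `sparseness` of the neurons lies at or above it."""
    ranked = sorted(r)
    k = min(len(r) - 1, int((1.0 - sparseness) * len(r)))
    threshold = ranked[k]
    return [1.0 / (1.0 + math.exp(-2.0 * beta * (ri - threshold))) for ri in r]

def trace_update(y, y_trace_prev, eta):
    """Trace: ybar_i(t) = (1 - eta) * y_i(t) + eta * ybar_i(t - 1)."""
    return [(1.0 - eta) * yi + eta * ti for yi, ti in zip(y, y_trace_prev)]

def learn_step(x, w, y_trace_prev, alpha=0.1, eta=0.8, sparseness=0.25, beta=10.0):
    """One pass through steps 1 to 4 for a single input vector x."""
    h = activation(x, w)
    r = lateral_inhibition(h)
    y = contrast_enhance(r, sparseness, beta)
    y_trace = trace_update(y, y_trace_prev, eta)
    new_w = []
    for row, yt in zip(w, y_trace):
        # Step 4: delta w_ij = alpha * ybar_i * x_j, then renormalise the row.
        updated = [wij + alpha * yt * xj for wij, xj in zip(row, x)]
        norm = math.sqrt(sum(v * v for v in updated)) or 1.0
        new_w.append([v / norm for v in updated])
    return new_w, y_trace

# Three neurons, four inputs; two similar stimuli shown one after the other.
weights = [[0.6, 0.8, 0.0, 0.0], [0.0, 0.0, 0.6, 0.8], [0.5, 0.5, 0.5, 0.5]]
trace = [0.0, 0.0, 0.0]
for stimulus in ([1.0, 1.0, 0.0, 0.0], [1.0, 0.9, 0.1, 0.0]):
    weights, trace = learn_step(stimulus, weights, trace)  # step 5: repeat
```

Because the trace carries over from one stimulus to the next, a neuron that responded to the first stimulus is still eligible for strengthening when the slightly shifted second stimulus arrives.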

Testing Process (take layer 1 for example)

$h_i = \sum_j x_j w_{ij}$

$r = h * I$


Experiment

Each image is 64×64 pixels and is shown at different positions in the 128×128 "retina".

The image was translated by 8 pixels for each move.
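The geometry of this setup can be checked with a short sketch. The slides do not say which positions were actually used, so enumerating every offset that keeps the image fully inside the retina is an assumption, and the helper names are hypothetical.

```python
def translation_offsets(retina_size=128, image_size=64, step=8):
    """Offsets along one axis that keep the image fully inside the retina."""
    last = retina_size - image_size
    return list(range(0, last + 1, step))

def place_on_retina(image, left, top, retina_size=128):
    """Copy `image` (a list of rows) onto a blank retina at (left, top)."""
    retina = [[0] * retina_size for _ in range(retina_size)]
    for dy, row in enumerate(image):
        for dx, value in enumerate(row):
            retina[top + dy][left + dx] = value
    return retina

offsets = translation_offsets()
image = [[1] * 64 for _ in range(64)]  # placeholder 64x64 image of ones
retina = place_on_retina(image, left=offsets[1], top=offsets[2])
```

Under this assumption there are 9 offsets per axis (0, 8, ..., 64), so at most 81 distinct positions on the retina.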

Experiment 1

Experiment 2

Learning rule: $\delta w_{ij} = \alpha\, \bar{y}_i^{\tau - 1} x_j^{\tau}$

Conclusion

Forming feature combinations at an early stage of processing

Trace learning rule

Solving translation invariance (responding to the same local spatial arrangement, ignoring the global position of the object)

Recognizing an object under different transforms after just a few seconds of inspecting it

It would be less suited to guiding actions in 3D space




Visnet


Future work

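Analysis of results

Input

Filter

$$\Gamma_{xy}(\rho, \theta, f) = \rho \left[ e^{-\left(\frac{x\cos\theta + y\sin\theta}{\sqrt{2}/f}\right)^{2}} - \frac{1}{1.6}\, e^{-\left(\frac{x\cos\theta + y\sin\theta}{1.6\sqrt{2}/f}\right)^{2}} \right] e^{-\left(\frac{x\sin\theta - y\cos\theta}{3\sqrt{2}/f}\right)^{2}}$$

Input connections

Determination of [symbol missing]

Determination of [symbol missing]

Visnet's output, taken as the output of a competitive network, is used for classification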




Visnet

Implementation of Visnet and analysis of results

Future work

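Future work

Keep looking for the cause and get Visnet's translation invariance working;

Test Visnet's invariance to view and size

Consider the role of feedback

The relationship between object recognition and stereo vision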

Thanks!
