Information Geometry and Neural Networks


Post on 12-Jan-2016


Page 1: Information Geometry and Neural Networks

Information Geometry and Neural Networks

Shun-ichi Amari
RIKEN Brain Science Institute

Orthogonal decomposition of rates and (higher-order) correlations

Synchronous firing and higher correlations

Algebraic singularities caused by multiple stimuli

Dynamics of learning in multilayer perceptrons

Page 2: Information Geometry and Neural Networks

Information Geometry

Systems Theory, Information Theory

Statistics, Neural Networks

Combinatorics, Physics, Information Sciences

Riemannian Manifold, Dual Affine Connections

Manifold of Probability Distributions

Math., AI

Page 3: Information Geometry and Neural Networks

Information Geometry

$S = \{p(x; \mu, \sigma)\}$

$p(x; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu)^2}{2\sigma^2}\right\}$

$S = \{p(x; \theta)\}$

Riemannian metric

Dual affine connections

$\theta = (\mu, \sigma)$

Page 4: Information Geometry and Neural Networks

Manifold of Probability Distributions

$x = 1, 2, 3$, $S = \{p(x)\}$

$p = (p_1, p_2, p_3)$, $p_1 + p_2 + p_3 = 1$

[Figure: axes $p_1$, $p_2$, $p_3$ with a point $p$]

$M = \{p(x; \theta)\}$

Page 5: Information Geometry and Neural Networks

Two Structures

Riemannian metric and affine connection

$ds^2 = \sum g_{ij}\, d\theta^i d\theta^j$

$D[p : q] = E_p\left[\log \frac{p(x)}{q(x)}\right]$

$D[p(x; \theta) : p(x; \theta + d\theta)] = \frac{1}{2}\, ds^2$

Fisher information

$g_{ij} = E[\partial_i \log p \; \partial_j \log p]$
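The link between the KL divergence and the Fisher metric can be checked numerically. The following is a minimal sketch, assuming a one-parameter Bernoulli family (chosen here only for illustration; it does not appear on the slide). The function names `kl_bernoulli` and `fisher_bernoulli` are hypothetical.

```python
import math

def kl_bernoulli(a, b):
    """KL divergence D[p_a : p_b] between Bernoulli(a) and Bernoulli(b)."""
    return a * math.log(a / b) + (1 - a) * math.log((1 - a) / (1 - b))

def fisher_bernoulli(theta):
    """Fisher information E[(d/dtheta log p)^2] of Bernoulli(theta)."""
    # The score is 1/theta when x = 1 and -1/(1 - theta) when x = 0.
    return theta * (1 / theta) ** 2 + (1 - theta) * (1 / (1 - theta)) ** 2

theta, d_theta = 0.3, 1e-3
kl = kl_bernoulli(theta, theta + d_theta)          # divergence of nearby points
half_ds2 = 0.5 * fisher_bernoulli(theta) * d_theta ** 2  # (1/2) g dtheta^2
print(kl, half_ds2)
```

For a small step the two printed values should agree to leading order; the gap is a third-order term in `d_theta`.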

Page 6: Information Geometry and Neural Networks

Riemannian Structure

$ds^2 = \sum g_{ij}(\theta)\, d\theta^i d\theta^j = d\theta^T G(\theta)\, d\theta$

$G(\theta) = (g_{ij}(\theta))$

Euclidean: $G = E$

Page 7: Information Geometry and Neural Networks

Affine Connection

covariant derivative: $\nabla_X Y$

geodesic: $\nabla_X X = 0$, $X = X(t)$

$s = \int \sqrt{g_{ij}(\theta)\, d\theta^i d\theta^j}$: minimal distance

straight line

Page 8: Information Geometry and Neural Networks

$S = \{p(x_1, x_2)\}$, $x_1, x_2 = 0, 1$

$M = \{q(x_1)\, q(x_2)\}$

Independent Distributions

Page 9: Information Geometry and Neural Networks

Neural Firing

$x_1, x_2, x_3, \ldots, x_n$

higher-order correlations

orthogonal decomposition

$p(\mathbf{x}) = p(x_1, x_2, \ldots, x_n)$

$\eta_i = E[x_i]$: firing rate

$v_{ij} = \mathrm{Cov}[x_i, x_j]$: covariance

Page 10: Information Geometry and Neural Networks

Information Geometry of Higher-Order Correlations: orthogonal decomposition

Riemannian metric

dual affine connections

Pythagoras theorem

Dual geodesics

,S p x

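Page 11: Information Geometry and Neural Networks

Correlations of Neural Firing

$p(x_1, x_2)$: $\{p_{00}, p_{10}, p_{01}, p_{11}\}$

$\eta_1 = p_{1\cdot}$, $\eta_2 = p_{\cdot 1}$

$\theta = \log \frac{p_{11}\, p_{00}}{p_{10}\, p_{01}}$

[Figure: two neurons $x_1$, $x_2$]

$\{(\eta_1, \eta_2), \theta\}$: orthogonal coordinates

$(\eta_1, \eta_2)$: firing rates; $\theta$: correlations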
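As a small worked example of these coordinates, the sketch below maps the four joint probabilities of two binary neurons to two firing rates and a log odds ratio. It assumes the index convention $p_{x_1 x_2}$ and uses the symbols $\eta_1$, $\eta_2$, $\theta$ as reconstructed names; the function `mixed_coordinates` is hypothetical.

```python
import math

def mixed_coordinates(p00, p01, p10, p11):
    """Return (eta1, eta2, theta) for a joint distribution of two binary units.

    Index convention (an assumption): p_ab = P(x1 = a, x2 = b).
    """
    eta1 = p10 + p11                              # firing rate of neuron 1
    eta2 = p01 + p11                              # firing rate of neuron 2
    theta = math.log(p11 * p00 / (p10 * p01))     # log odds ratio
    return eta1, eta2, theta

# Independent firing gives theta = 0 whatever the rates are.
print(mixed_coordinates(0.4, 0.4, 0.1, 0.1))
# Firing together more often than chance gives theta > 0.
print(mixed_coordinates(0.4, 0.1, 0.1, 0.4))
```

The log odds ratio can be changed without touching the two rates, which is what makes it a candidate correlation coordinate separate from the firing rates.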

Page 12: Information Geometry and Neural Networks

00110001011010100100110100

0101101001010

firing rates: correlation or covariance?

$x_1$, $x_2$, $x_3$

$\{p_{00}, p_{01}, p_{10}, p_{11}\}$

1 2 12, ;

Page 13: Information Geometry and Neural Networks

$S = \{p(x_1, x_2)\}$, $x_1, x_2 = 0, 1$

$M = \{q(x_1)\, q(x_2)\}$

Independent Distributions

Page 14: Information Geometry and Neural Networks

Pythagoras Theorem

[Figure: triangle with vertices $p$, $q$, $r$; labels "independent" and "correlations"]

$D[p : r] = D[p : q] + D[q : r]$

$p, q$: same marginals $\eta_1, \eta_2$

$r, q$: same correlations

$D[p : r] = \sum_x p(x) \log \frac{p(x)}{r(x)}$

estimation; correlation testing

invariant under firing rates
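The Pythagorean relation can be verified numerically in the simplest case. This is a sketch under stated assumptions: two binary variables, $q$ taken as the product of the marginals of $p$, and $r$ an arbitrary independent distribution (so $q$ and $r$ both have zero correlation). The numbers and helper names are illustrative, not from the slides.

```python
import math

def kl(p, r):
    """KL divergence D[p : r] for distributions given as lists."""
    return sum(pi * math.log(pi / ri) for pi, ri in zip(p, r) if pi > 0)

def independent(a, b):
    """Product distribution over (x1, x2), listed in the order 00, 01, 10, 11."""
    return [(1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b]

p = [0.4, 0.1, 0.2, 0.3]        # a correlated joint distribution
eta1 = p[2] + p[3]              # P(x1 = 1)
eta2 = p[1] + p[3]              # P(x2 = 1)
q = independent(eta1, eta2)     # same marginals as p, no correlation
r = independent(0.7, 0.2)       # another distribution with no correlation

lhs = kl(p, r)
rhs = kl(p, q) + kl(q, r)
print(lhs, rhs)
```

The two printed values coincide up to rounding, because $\log(q/r)$ is a sum of single-variable terms and $p$ and $q$ share their marginals.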

Page 15: Information Geometry and Neural Networks

01100101... 110001011001... 101000111100... 1001

$x_1$, $x_2$, $x_3$

No pairwise correlations, triplewise correlation

$p(x_1, x_2, x_3) \neq p(x_1)\, p(x_2)\, p(x_3)$

$p(x_1, x_2) = p(x_1)\, p(x_2)$
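A concrete distribution with this property is the even-parity distribution on three binary variables. The sketch below uses it as an illustrative choice (it is not stated on the slide) and checks that every pair factorizes while the full joint does not.

```python
from itertools import product

# Uniform distribution over the four even-parity patterns of (x1, x2, x3).
p = {x: (0.25 if sum(x) % 2 == 0 else 0.0) for x in product((0, 1), repeat=3)}

def marginal(dist, keep):
    """Marginal distribution over the variables whose indices are in `keep`."""
    out = {}
    for x, prob in dist.items():
        key = tuple(x[i] for i in keep)
        out[key] = out.get(key, 0.0) + prob
    return out

singles = [marginal(p, (i,)) for i in range(3)]

# Every pairwise marginal equals the product of its single-variable marginals.
pairs_factorize = all(
    abs(marginal(p, (i, j))[(a, b)] - singles[i][(a,)] * singles[j][(b,)]) < 1e-12
    for i, j in ((0, 1), (0, 2), (1, 2))
    for a in (0, 1)
    for b in (0, 1)
)

# The three-variable joint does not equal the product of the marginals.
joint_factorizes = all(
    abs(p[x] - singles[0][(x[0],)] * singles[1][(x[1],)] * singles[2][(x[2],)]) < 1e-12
    for x in p
)
print(pairs_factorize, joint_factorizes)
```

Here the dependence lives entirely in the third-order interaction, so statistics based only on pairwise covariance would miss it.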

Page 16: Information Geometry and Neural Networks

Pythagoras Decomposition of KL Divergence

$p(\mathbf{x})$

$p_{\mathrm{ind}}(\mathbf{x})$: independent

$p_{\mathrm{pairwise\ corr}}(\mathbf{x})$: only pairwise

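Page 17: Information Geometry and Neural Networks

Higher-Order Correlations

$\mathbf{x} = (x_1, x_2, \ldots, x_n)$

$p(\mathbf{x}) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots - \psi\right\}$

$\eta_i = E[x_i]$, $\eta_{ij} = E[x_i x_j]$

$\theta = (\theta_i, \theta_{ij}, \theta_{ijk}, \ldots)$

$\eta = (\eta_i, \eta_{ij}, \eta_{ijk}, \ldots)$

[Figure labels: $M_0$, $M_1$]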

Page 18: Information Geometry and Neural Networks

Synfiring and Higher-Order Correlations

Amari, Nakahara, Wu, Sakai

Page 19: Information Geometry and Neural Networks

Neurons

[Figure: neurons $x_1, x_2, \ldots, x_n$]

$x_i = 1(u_i)$

$u_i$: Gaussian, $E[u_i u_j] = \alpha$

Population and Synfire

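Page 20: Information Geometry and Neural Networks

Population and Synfire

$u_i = \sum_j w_{ij} s_j - h$, $x_i = 1(u_i)$

$u_i = \sqrt{1 - \alpha}\, \varepsilon_i + \sqrt{\alpha}\, \varepsilon - h$

$\varepsilon_i, \varepsilon \sim N(0, 1)$

[Figure: common input $s$ to neurons $x_1, \ldots, x_n$]

$E[u_i u_j] = \alpha$, $E[u_i^2] = 1$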

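Page 21: Information Geometry and Neural Networks

$p_i = \mathrm{Prob}\{i \text{ neurons fire at the same time}\}$

$p_i = {}_nC_i\, F^i (1 - F)^{n - i}$

$F = \Pr\{x_i = 1\} = \Pr\{u_i > 0\} = \Pr\left\{\varepsilon_i > \frac{h - \sqrt{\alpha}\, \varepsilon}{\sqrt{1 - \alpha}}\right\}$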

Page 22: Information Geometry and Neural Networks

$p_i = \mathrm{Prob}\{i \text{ neurons fire at the same time}\}$

$r = \frac{i}{n}$, $P_r = \Pr\{nr \text{ neurons fire}\}$

( , ) nH r nzq r e e d FrFr

nz 1 log 1 log

2

2

dt 2

1 2

0

2thaehaFF

Page 23: Information Geometry and Neural Networks

1 22 1( , ) exp[ { ( ) } ]

2(1 ) 2 1q r c F h

1 2

$p(\mathbf{x}, \theta) = \exp\left\{\sum \theta_i x_i + \sum \theta_{ij} x_i x_j + \sum \theta_{ijk} x_i x_j x_k + \cdots\right\}$

$\theta_{i_1 \cdots i_k} = O(1/n^{k-1})$

Page 24: Information Geometry and Neural Networks

Synfiring

$p(\mathbf{x}) = p(x_1, \ldots, x_n)$

$r = \frac{1}{n} \sum x_i$, $q(r)$

[Figure: $q(r)$ plotted against $r$]

Page 25: Information Geometry and Neural Networks

Bifurcation

[Figure: $P_r$ plotted against $r$]

$x_i$ independent: single delta peak

pairwise correlated

higher-order correlation!
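The effect of a shared input on the population rate $r$ can be illustrated by simulation. This is a rough sketch under an assumed model, since the exact equations are not recoverable from the transcript: each neuron thresholds a Gaussian input made of a private term and a common term, mixed with a hypothetical weight `alpha`, minus a threshold `h`. All names and parameter values are illustrative.

```python
import math
import random

def simulate_rate(n, alpha, h, rng):
    """Fraction of n neurons that fire in one trial of the assumed model."""
    eps = rng.gauss(0.0, 1.0)                 # common input shared by all neurons
    fired = 0
    for _ in range(n):
        v = rng.gauss(0.0, 1.0)               # private input of this neuron
        u = math.sqrt(1 - alpha) * v + math.sqrt(alpha) * eps - h
        fired += 1 if u > 0 else 0            # x_i = 1 when u_i > 0
    return fired / n

def rate_variance(n, alpha, h, trials, seed=0):
    """Variance of the population rate r across repeated trials."""
    rng = random.Random(seed)
    rates = [simulate_rate(n, alpha, h, rng) for _ in range(trials)]
    mean = sum(rates) / trials
    return sum((r - mean) ** 2 for r in rates) / trials

var_independent = rate_variance(n=200, alpha=0.0, h=0.0, trials=300)
var_correlated = rate_variance(n=200, alpha=0.5, h=0.0, trials=300)
print(var_independent, var_correlated)
```

With `alpha = 0` the rate concentrates near a single value as `n` grows, while a shared input keeps the distribution of `r` broad.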

Page 26: Information Geometry and Neural Networks

Field Theory of Population Coding

Shun-ichi Amari
RIKEN Brain Science Institute

Collaborators: Si Wu, Hiro Nakahara

Page 27: Information Geometry and Neural Networks

* *|x r z x

$r(z) = f(z - x^*) + \varepsilon(z)$

$f(z) = \exp\left\{-\frac{z^2}{2a^2}\right\}$

Population Coding and Neural Field

[Figure axis: $z$]

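Page 28: Information Geometry and Neural Networks

Population Encoding

$r(z) = f(z - x) + \varepsilon(z)$

decoding: $r(z) \to \hat{x}$

[Figure: stimulus $x$, tuning function $f(z - x)$, and response $r(z)$ along the $z$ axis]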

Page 29: Information Geometry and Neural Networks

Noise

2

2

22

0

' '

', ' 1 ' exp

2

z

n z z h z z

z zh z z n z z n

b

b

z

Page 30: Information Geometry and Neural Networks

Probability Model

2

12

( ) exp2

nQ r z x c r z f z x h r z f z x

$r(z)\, h^{-1}\, r(z) = \iint r(z)\, h^{-1}(z, z')\, r(z')\, dz\, dz'$

$\int h^{-1}(z, z')\, h(z', z'')\, dz' = \delta(z - z'')$

$r(z) = f(z - x) + \varepsilon(z)$

Page 31: Information Geometry and Neural Networks

Fisher information

$I(x^*) = E\left[\left(\frac{d \log Q(r \mid x^*)}{dx^*}\right)^2\right]$

Cramér-Rao

$E[(\hat{x} - x^*)^2] \ge \frac{1}{I(x^*)}$
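For intuition, the Fisher information and the Cramér-Rao bound can be computed in a simplified setting. The sketch below assumes a discrete grid of neurons with Gaussian tuning of width `a` and independent Gaussian noise of standard deviation `sigma`. These are simplifications for illustration only; the slides treat a continuous field with correlated noise, which is not reproduced here.

```python
import math

def tuning(z, a):
    """Gaussian tuning function f(z) = exp(-z^2 / (2 a^2))."""
    return math.exp(-z * z / (2 * a * a))

def tuning_derivative(z, a):
    """Derivative f'(z) of the Gaussian tuning function."""
    return -z / (a * a) * tuning(z, a)

def fisher_information(x, centers, a, sigma):
    """Fisher information about x for responses r_i = f(z_i - x) + noise.

    Assumes independent Gaussian noise with standard deviation sigma.
    """
    # d/dx f(z - x) = -f'(z - x); the sign disappears after squaring.
    return sum(tuning_derivative(z - x, a) ** 2 for z in centers) / sigma ** 2

centers = [i / 10 for i in range(-100, 101)]   # 10 neurons per unit length
info = fisher_information(0.0, centers, a=1.0, sigma=1.0)
cramer_rao_bound = 1.0 / info                  # lower bound on squared error
print(info, cramer_rao_bound)
```

In this independent-noise case the information grows in proportion to the neuron density, so the bound on the squared decoding error shrinks accordingly.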

Page 32: Information Geometry and Neural Networks

Fourier Analysis

1

2i zf z F f z e dz

' 1

2i zh z z H h z e dz

222

22

FnI d

H

Page 33: Information Geometry and Neural Networks

Fisher Information

2 2

2 2

2

22 2

21 2

a

b

n eI d

n b n e

3 2

3 2

1) No correlation 0

2) Uniform correlations

1

nI

ab

nI

a

Page 34: Information Geometry and Neural Networks

2 3

2

3) Limited range correlations

1

1 '

14) Wide range correlations:

10 1

5) Special case: 1, 2

cb

nn

Ia c

bn

I A dc

b a

I

Page 35: Information Geometry and Neural Networks

Dynamics of Neural Fields

, , ,

u z tu z t w z z u z t dz

uc r z

Shaping, Detecting, Decoding

Page 36: Information Geometry and Neural Networks

How the Brain Solves Singularity in Population Coding

S. Amari and H. Nakahara

RIKEN Brain Science Institute

Page 37: Information Geometry and Neural Networks

[Figure, two panels: stimuli $x_1$ and $x_2$ on the $z$ axis]

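Page 38: Information Geometry and Neural Networks

Neural Activity

$r(z) = v\, \varphi(z - x_1) + (1 - v)\, \varphi(z - x_2) + \varepsilon(z)$

$Q(r(z); v, x_1, x_2) = \exp\left\{-\frac{1}{2} (r - f)\, h^{-1} (r - f)\right\}$

$I_{ij} = E[\partial_i \log Q\; \partial_j \log Q]$

$I = (I_{ij})$: Fisher information matrix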

Page 39: Information Geometry and Neural Networks

Parameter Space

[Figure: parameter space with axes $v$, $x_1$, $x_2$]

Page 40: Information Geometry and Neural Networks

$u = x_2 - x_1$: difference

$w = v x_1 + (1 - v) x_2$: center of gravity

coordinates: $(w, u, v)$

Fisher information $I$ degenerates as $u \to 0$

Cramér-Rao paradigm: error

Page 41: Information Geometry and Neural Networks

2 2 1 3 3 1 1

2 3

1

2

3

; 1

1 1 2 1, ,

2 6

f z H z H z z

v v v v vw u u

g

I g

g

Page 42: Information Geometry and Neural Networks

: Jacobian singular

T

J

I J I J

Page 43: Information Geometry and Neural Networks

2

3

2

~ 1

1~

1~

1~i

w O

u Ou

v Ou

x Ou

w

Page 44: Information Geometry and Neural Networks

synfiring resolves singularity

1 1 2

2 1 2

phase 1:

:

f z v z x v z x

f z v z x v z x

1 , 1v v

$I$: regular as $u \to 0$

Page 45: Information Geometry and Neural Networks

[Figure, two panels: stimuli $x_1$ and $x_2$ on the $z$ axis]

Page 46: Information Geometry and Neural Networks

synfiring mechanism

[Figure labels: $z_1$, $z_2$]

common multiplicative noise

Page 47: Information Geometry and Neural Networks

S. Amari and H. Nagaoka, Methods of Information Geometry, AMS & Oxford Univ. Press, 2000

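Page 48: Information Geometry and Neural Networks

Mathematical Neurons

$y = \varphi\left(\sum w_i x_i - h\right) = \varphi(\mathbf{w} \cdot \mathbf{x})$

[Figure: input $\mathbf{x}$, output $y$, and activation function $\varphi(u)$ plotted against $u$]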

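Page 49: Information Geometry and Neural Networks

Multilayer Perceptrons

$y = \sum v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x}) + n$

$p(y \mid \mathbf{x}; \theta) = c \exp\left\{-\frac{1}{2} (y - f(\mathbf{x}, \theta))^2\right\}$

$f(\mathbf{x}, \theta) = \sum v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

[Figure: network mapping input $\mathbf{x}$ to output $y$]

$\mathbf{x} = (x_1, x_2, \ldots, x_n)$

$\theta = (\mathbf{w}_1, \ldots, \mathbf{w}_m; v_1, \ldots, v_m)$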
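The network function on this slide can be written out directly. Below is a minimal sketch of a one-hidden-layer perceptron output, a weighted sum of hidden-unit activations, assuming `tanh` as the activation (the slide does not specify one). The name `mlp_output` and its arguments are hypothetical.

```python
import math

def mlp_output(x, weights, output_weights, activation=math.tanh):
    """Sum over hidden units of v_i * activation(w_i . x).

    x              -- input vector (list of floats)
    weights        -- list of hidden weight vectors w_1..w_m
    output_weights -- list of output weights v_1..v_m
    activation     -- hidden-unit nonlinearity (tanh is an assumption)
    """
    total = 0.0
    for w_i, v_i in zip(weights, output_weights):
        pre_activation = sum(w_ij * x_j for w_ij, x_j in zip(w_i, x))
        total += v_i * activation(pre_activation)
    return total

# Two inputs, two hidden units.
x = [0.5, -1.0]
weights = [[1.0, 0.0], [0.0, 2.0]]
output_weights = [2.0, 0.5]
print(mlp_output(x, weights, output_weights))
```

A hidden unit whose output weight is zero contributes nothing to the result regardless of its weight vector, which is one way the parameters become non-identifiable.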

Page 50: Information Geometry and Neural Networks

Multilayer Perceptron

$y = f(\mathbf{x}, \theta) = \sum v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

$\theta = (\mathbf{w}_1, \ldots, \mathbf{w}_m; v_1, \ldots, v_m)$

neuromanifold

space of functions

Page 51: Information Geometry and Neural Networks

Neuromanifold

• Metrical structure

• Topological structure

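Page 52: Information Geometry and Neural Networks

Riemannian manifold

$ds^2 = |d\theta|^2 = \sum g_{ij}(\theta)\, d\theta^i d\theta^j = d\theta^T G\, d\theta$

[Figure labels: $\theta^i$, $\theta^j$, $d\theta$]

$g_{ij}(\theta) = E\left[\frac{\partial \log p(y \mid \mathbf{x}; \theta)}{\partial \theta^i} \frac{\partial \log p(y \mid \mathbf{x}; \theta)}{\partial \theta^j}\right]$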

Page 53: Information Geometry and Neural Networks

Geometry of singular model

$y = v\, \varphi(\mathbf{w} \cdot \mathbf{x}) + n$

$v\, |\mathbf{w}| = 0$

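Page 54: Information Geometry and Neural Networks

Gaussian mixture

$p(x; v, w_1, w_2) = v\, \psi(x - w_1) + (1 - v)\, \psi(x - w_2)$

$\psi(x) = \frac{1}{\sqrt{2\pi}} \exp\left\{-\frac{1}{2} x^2\right\}$

singular: $w_1 = w_2$, $v(1 - v) = 0$

[Figure: parameter space with axes $w_1$, $w_2$, $v$]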
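The singular set of the mixture can be made concrete with a few evaluations. This sketch assumes a two-component mixture of unit-variance Gaussians with mixing weight `v` and means `w1`, `w2`; the function names are hypothetical. It shows that different parameter values give the same density on the singular set.

```python
import math

def gauss(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def mixture(x, v, w1, w2):
    """Two-component Gaussian mixture with mixing weight v and means w1, w2."""
    return v * gauss(x - w1) + (1 - v) * gauss(x - w2)

xs = [-2.0, -0.5, 0.0, 1.0, 3.0]

# Case w1 = w2: the mixing weight v has no effect on the density.
same_means = all(
    math.isclose(mixture(x, 0.2, 1.0, 1.0), mixture(x, 0.9, 1.0, 1.0)) for x in xs
)

# Case v = 1: the second mean w2 has no effect on the density.
weight_one = all(
    math.isclose(mixture(x, 1.0, 1.0, -4.0), mixture(x, 1.0, 1.0, 7.0)) for x in xs
)
print(same_means, weight_one)
```

Because whole families of parameters map to one distribution there, the Fisher information matrix loses rank on that set.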

Page 55: Information Geometry and Neural Networks

Topological Singularities

S

M

Page 56: Information Geometry and Neural Networks

singularities

Page 57: Information Geometry and Neural Networks

Singularity of MLP: example

Page 58: Information Geometry and Neural Networks

Backpropagation: gradient learning

examples: $(y_1, \mathbf{x}_1), \ldots, (y_t, \mathbf{x}_t)$ (training set)

$E(y, \mathbf{x}; \theta) = \frac{1}{2} |y - f(\mathbf{x}, \theta)|^2 = -\log p(y, \mathbf{x}; \theta)$

$\Delta\theta_t = -\eta_t \frac{\partial E}{\partial \theta}$

$f(\mathbf{x}, \theta) = \sum v_i\, \varphi(\mathbf{w}_i \cdot \mathbf{x})$

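Page 59: Information Geometry and Neural Networks

Information Geometry of MLP

Natural Gradient Learning: S. Amari; H.Y. Park

$\Delta\theta_t = -\eta_t\, G^{-1} \frac{\partial E}{\partial \theta}$

$G^{-1}_{t+1} = (1 + \varepsilon_t)\, G^{-1}_t - \varepsilon_t\, G^{-1}_t\, \nabla f\, \nabla f^T\, G^{-1}_t$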
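A natural gradient step preconditions the ordinary gradient with the inverse metric. The sketch below assumes the parameter update is the learning rate times the inverse metric applied to the loss gradient, and that the inverse metric is tracked by a rank-one recursion of the form $(1 + \varepsilon) G^{-1} - \varepsilon\, G^{-1} \nabla f\, \nabla f^T G^{-1}$. Both forms are read from the garbled slide and should be treated as assumptions; all function names are hypothetical.

```python
def mat_vec(m, v):
    """Matrix-vector product m v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]

def vec_mat(v, m):
    """Row-vector times matrix, v^T m."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

def update_inverse_metric(g_inv, grad_f, eps):
    """One step of the assumed rank-one recursion for the inverse metric."""
    left = mat_vec(g_inv, grad_f)     # G^{-1} grad_f, a column vector
    right = vec_mat(grad_f, g_inv)    # grad_f^T G^{-1}, a row vector
    n = len(grad_f)
    return [
        [(1 + eps) * g_inv[i][j] - eps * left[i] * right[j] for j in range(n)]
        for i in range(n)
    ]

def natural_gradient_step(theta, grad_loss, g_inv, eta):
    """theta <- theta - eta * G^{-1} grad_loss."""
    direction = mat_vec(g_inv, grad_loss)
    return [t - eta * d for t, d in zip(theta, direction)]

g_inv = [[1.0, 0.0], [0.0, 1.0]]
g_inv = update_inverse_metric(g_inv, grad_f=[1.0, 0.0], eps=0.1)
theta = natural_gradient_step([1.0, 1.0], grad_loss=[2.0, 4.0], g_inv=g_inv, eta=0.1)
print(g_inv, theta)
```

The recursion needs only matrix-vector products, so no explicit matrix inversion is performed at any step.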

Page 60: Information Geometry and Neural Networks

$y = v_1\, \varphi(\mathbf{w}_1 \cdot \mathbf{x}) + v_2\, \varphi(\mathbf{w}_2 \cdot \mathbf{x}) + n$

1 2

1 2

w w w

v v v

2 1

2 1

u w w

z v v

x y

1w

2w

z

1w

2w

1v

2v

Page 61: Information Geometry and Neural Networks
Page 62: Information Geometry and Neural Networks

2 hidden-units

1 1 2

1 2

1 2

2 1

2 1

2

: y v v n

w w w

v

u w w

v vz

v

v

v

2w x w x

Page 63: Information Geometry and Neural Networks

Dynamics of Learning

$\frac{d\theta}{dt} = -\eta \nabla l$, $\frac{d\theta}{dt} = -\eta\, G^{-1} \nabla l$

$\frac{du}{dt} = f(u, z)$, $\frac{dz}{dt} = k(u, z)$

$\frac{du}{dz} = \frac{f(u, z)}{k(u, z)}$

2 2 1

log2

u z z c

Page 64: Information Geometry and Neural Networks

The teacher is on singularity

2 2 3

2 4

2

1( )4

1( )4

1( )4

duA z u

dtdz

A z zudt

dz zu

du z

2 2 1log

2u z z c

Page 65: Information Geometry and Neural Networks

The teacher is on singularity

2 2 3

2 4

2

1( )4

1( )4

1( )4

duA z u

dtdz

A z zudt

dz zu

du z

2 2 1log

2u z z c

Page 66: Information Geometry and Neural Networks
Page 67: Information Geometry and Neural Networks
Page 68: Information Geometry and Neural Networks
Page 69: Information Geometry and Neural Networks