Post on 15-Jul-2020


Page 1: Information Geometry of Statistical Inference (image.diku.dk/MLLab/SummerSchools/SlidesAmariIG2.pdf)

Information Geometry of Statistical Inference

Statistical model: $M = \{p(x, \theta)\}$

Observed data: $D = \{x_1, x_2, \ldots, x_N\}$, iid

Estimation

Hypothesis testing

Page 2

Exponential family:

$p(x, \theta) = \exp\{\theta \cdot x - \psi(\theta)\}$

$p(D, \theta) = \prod_{i=1}^{N} p(x_i, \theta) = \exp\{N(\theta \cdot \bar{x} - \psi(\theta))\}$

1 observed point: $\hat{\eta} = \bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$

$\eta = E[x]$; $\varphi(\eta)$: negentropy

MLE (maximum likelihood estimator)
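As a small illustration of the statement that the MLE of an exponential family is the observed point $\hat{\eta} = \bar{x}$, here is a minimal sketch for a Bernoulli model written in natural form, $p(x, \theta) = \exp\{\theta x - \psi(\theta)\}$ with $\psi(\theta) = \log(1 + e^{\theta})$. The Bernoulli choice, the toy data, and all function names are assumptions made for illustration; they are not taken from the slides.

```python
import math

def psi(theta):
    """Cumulant function of the Bernoulli family in natural coordinates."""
    return math.log1p(math.exp(theta))

def eta(theta):
    """Expectation parameter eta = E[x] = d psi / d theta."""
    return 1.0 / (1.0 + math.exp(-theta))

def log_likelihood(theta, data):
    """log p(D, theta) = N * (theta * xbar - psi(theta))."""
    n = len(data)
    xbar = sum(data) / n
    return n * (theta * xbar - psi(theta))

def mle_theta(data):
    """Solve eta(theta) = xbar, i.e. theta_hat = logit(xbar); needs 0 < xbar < 1."""
    xbar = sum(data) / len(data)
    return math.log(xbar / (1.0 - xbar))

# Hypothetical toy sample of coin flips.
data = [1, 0, 1, 1, 0, 1, 0, 1]
theta_hat = mle_theta(data)
```

At `theta_hat` the expectation parameter `eta(theta_hat)` coincides with the sample mean, which is the sense in which the data collapse to one observed point.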

Page 3

Estimation error

Cramér-Rao bound
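For reference, the bound named on this slide has the following standard form for an unbiased estimator built from $N$ iid observations. The symbols $G$ (Fisher information matrix) and $\hat{\theta}$ are generic notation assumed here, and $\succeq$ denotes the positive semidefinite ordering.

```latex
E\big[(\hat{\theta} - \theta)(\hat{\theta} - \theta)^{T}\big] \;\succeq\; \frac{1}{N}\, G^{-1}(\theta),
\qquad
G(\theta) = E\big[\partial_\theta \log p(x, \theta)\, \partial_\theta \log p(x, \theta)^{T}\big].
```

For an exponential family in natural coordinates, $G(\theta)$ is the Hessian of $\psi(\theta)$.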

Page 4

curved exponential family:

$p(x, \theta) = \exp\{\theta \cdot x - \psi(\theta)\}$

$p(x, u) = p(x, \theta(u))$, observations $x_1, x_2, \ldots, x_n$

$p(D, u) = \exp\{n(\theta(u) \cdot \bar{x} - \psi(\theta(u)))\}$

$\hat{\eta} = \bar{x}$

$\hat{u}(x_1, \ldots, x_n)$: estimator

Page 5

Ancillary family

Estimator: $\hat{u} = f(\hat{\eta})$

Ancillary family: $A(u)$

Page 6

Mathematical Analysis of Error

(u, v)-coordinates along A(u)

( , )

( , )u v

u v

u e e

Consistent

Efficient

Higher‐order efficient

Page 7

Mle is consistent and efficient

Efficient estimator ‐‐‐ orthogonal projection

Page 8

High-Order Asymptotics

$p(x, \theta(u))$: observations $x_1, \ldots, x_n$

$\hat{u} = \hat{u}(x_1, \ldots, x_n)$

$e = E[(\hat{u} - u)(\hat{u} - u)^{T}]$

$e = \frac{1}{n} G_1 + \frac{1}{n^2} G_2$

$G_1 = G^{-1}$: Cramér-Rao: linear theory

$G_2 = (\Gamma^{m})^2 + (H_M^{e})^2 + (H_A^{m})^2$: quadratic approximation

$\hat{u}$, $\hat{\eta} = \bar{x}$

Page 9

Hypothesis Testing

Page 10

Neyman-Scott Problem

Estimation with nuisance parameter

$M = \{p(x, u, v)\}$

Efficient score

Page 11

Neyman-Scott problem

$M = \{p(x, u, v)\}$

$x_1 \sim p(x, u, v_1)$

$x_2 \sim p(x, u, v_2)$

$\vdots$

$x_N \sim p(x, u, v_N)$

u: parameter of interest

v: nuisance parameter

Page 12

Semiparametric Statistical Model

$M = \{p(x; \theta, Z)\}$, $Z = Z(\xi)$

Linear relation: $y = \theta x$

$x_i = \xi_i + \varepsilon_i$

$y_i = \theta \xi_i + \varepsilon_i'$

mle, least squares, total least squares

$p(x, y; \theta, Z) = \int p(x, y; \theta, \xi) \, Z(\xi) \, d\xi$

[Figure: linear relation between $x$ and $y$; axes $x$ and $y$]

Page 13

Statistical Model

$p(x, y; \theta, \xi) = c \exp\{-\frac{1}{2}(x - \xi)^2 - \frac{1}{2}(y - \theta \xi)^2\}$

$p(x_i, y_i; \theta, \xi_i)$: parameters $\theta, \xi_1, \ldots, \xi_n$

semiparametric: $p(x, y; \theta, Z) = \int p(x, y; \theta, \xi) \, Z(\xi) \, d\xi$

Page 14

Least squares?

$L = \sum_i (y_i - \theta x_i)^2 \to \min$: $\hat{\theta} = \frac{\sum_i x_i y_i}{\sum_i x_i^2}$

$\hat{\theta} = \frac{1}{n} \sum_i \frac{y_i}{x_i}$, $\hat{\theta} = \frac{\sum_i y_i}{\sum_i x_i}$

Neyman-Scott: mle, TLS

$\sum_i (y_i - \theta x_i)(\theta y_i + x_i) = 0$
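The contrast between ordinary least squares and total least squares (TLS) for a line through the origin can be sketched as follows. The TLS slope is taken as the root of $\sum_i (y_i - \theta x_i)(\theta y_i + x_i) = 0$, which is the stationarity condition of $\sum_i (y_i - \theta x_i)^2 / (1 + \theta^2)$; expanding gives a quadratic in $\theta$. The function names and the choice of root (the one with the sign of $\sum_i x_i y_i$) are assumptions of this sketch.

```python
import math

def least_squares_slope(xs, ys):
    """Ordinary least squares through the origin: sum(x*y) / sum(x*x)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / sxx

def total_least_squares_slope(xs, ys):
    """Root of sum((y - t*x) * (t*y + x)) = 0 with the same sign as sum(x*y).

    Expanding the sum gives sxy*t**2 + (sxx - syy)*t - sxy = 0.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    if sxy == 0.0:
        raise ValueError("slope is not identified when sum(x*y) is zero")
    d = syy - sxx
    return (d + math.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)
```

On noise-free data lying exactly on $y = \theta x$ both functions return $\theta$; they differ once both coordinates carry noise.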

Page 15

Semiparametric statistical model: $x_1, x_2, \ldots \sim p(x; \theta, Z)$

Estimating function $f(x, \theta)$:

$E_{\theta, Z}[f(x, \theta)] = 0$

$E_{\theta, Z}[f(x, \theta')] \neq 0 \quad (\theta' \neq \theta)$

Estimating equation: $\sum_i f(x_i, \hat{\theta}) = 0$

Page 16

estimating function

$E_{\theta, Z}[f(x, \theta)] = 0$: unbiased

$\sum_{i=1}^{n} f(x_i, \hat{\theta}) = 0$: $\hat{\theta} = \theta + e$

$E[(\hat{\theta} - \theta)^2] = \frac{1}{n} \frac{E[f^2]}{\{E[\partial_\theta f]\}^2}$
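The error formula for an estimating equation can be motivated by a first-order expansion. The following is a heuristic sketch for scalar $\theta$, assuming $f$ is differentiable in $\theta$, writing $f' = \partial f / \partial \theta$ and $e = \hat{\theta} - \theta$, and omitting regularity conditions.

```latex
0 = \sum_{i=1}^{n} f(x_i, \hat{\theta})
  \approx \sum_{i=1}^{n} f(x_i, \theta) + e \sum_{i=1}^{n} f'(x_i, \theta),
\\
e \approx -\frac{\frac{1}{n}\sum_{i} f(x_i, \theta)}{\frac{1}{n}\sum_{i} f'(x_i, \theta)}
  \approx -\frac{\frac{1}{n}\sum_{i} f(x_i, \theta)}{E[f']},
\\
E[e^2] \approx \frac{1}{\{E[f']\}^2} \cdot \frac{1}{n^2} \sum_{i} E\big[f(x_i, \theta)^2\big]
  = \frac{1}{n}\, \frac{E[f^2]}{\{E[f']\}^2}.
```

The cross terms $E[f(x_i, \theta) f(x_j, \theta)]$, $i \neq j$, vanish because the observations are independent and $E[f] = 0$.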

Page 17

Fiber Bundle

$u(x, y; \theta, Z) = \frac{\partial}{\partial \theta} \log p$

$v(x, y; \theta, Z) = \frac{\partial}{\partial Z} \log p$

[Figure: the model $\{p(x, y; \theta, Z)\}$ with coordinates $\theta$ and $Z$]

Page 18

Parallel Transport

$T_{\theta, Z} = \{r(x) : E_{\theta, Z}[r(x)] = 0\}$

$\langle r_1, r_2 \rangle = E[r_1(x) r_2(x)]$

e-parallel transport: ${}^{e}\prod_{z}^{z'} r(x) = r(x) - E_{\theta, z'}[r(x)]$

m-parallel transport: ${}^{m}\prod_{z}^{z'} r(x) = \frac{p(x, \theta, z)}{p(x, \theta, z')} r(x)$

$\langle r_1, r_2 \rangle_z = \langle {}^{e}\prod_{z}^{z'} r_1, {}^{m}\prod_{z}^{z'} r_2 \rangle_{z'}$

[Figure: tangent space $T_{\theta, Z}$ at the point $p(x, \theta, Z)$]
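The duality between the two transports can be checked in one line. This sketch assumes the usual definitions: the e-parallel transport from $z$ to $z'$ subtracts the mean under $p(x, \theta, z')$, and the m-parallel transport multiplies by the density ratio $p(x, \theta, z) / p(x, \theta, z')$. The notation ${}^{e}\prod$, ${}^{m}\prod$ is an assumption of this sketch.

```latex
\Big\langle {}^{e}\!\prod_{z}^{z'} r_1,\; {}^{m}\!\prod_{z}^{z'} r_2 \Big\rangle_{z'}
= \int p(x, \theta, z') \big( r_1(x) - E_{\theta, z'}[r_1] \big)
  \frac{p(x, \theta, z)}{p(x, \theta, z')}\, r_2(x)\, dx
\\
= E_{\theta, z}[r_1 r_2] - E_{\theta, z'}[r_1]\, E_{\theta, z}[r_2]
= \langle r_1, r_2 \rangle_{z}.
```

The last step uses $E_{\theta, z}[r_2] = 0$, which holds because $r_2$ belongs to the tangent space at $z$.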

Page 19

Estimating Function $f(x, \theta)$

$T_{\theta, Z} = T^{I} \oplus T^{N} \oplus T^{A}$

$u^{I}(x, \theta, z)$: optimal estimating function

e-invariant: $E_{\theta, z}[f(x, \theta)] = 0$, ${}^{e}\prod_{z}^{z'} f(x, \theta) = f$

m-orthogonality: $\langle v, f \rangle_z = 0$, $\langle {}^{m}\prod_{z}^{z'} v, f \rangle_{z'} = 0$

Page 20

Page 21

Example of estimating functions

$f(x, y; \theta) = k(x + \theta y)(y - \theta x)$: $s = x + \theta y$

$E_{\theta, Z}[f] = \int p(x, y; \theta, \xi) \, f(x, y; \theta) \, Z(\xi) \, dx \, dy \, d\xi = 0$

$\sum_i k(x_i + \theta y_i)(y_i - \theta x_i) = 0$
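A minimal sketch of solving an estimating equation of the form $\sum_i k(x_i + \theta y_i)(y_i - \theta x_i) = 0$ numerically is given below. It assumes the errors-in-variables model $x_i = \xi_i + \varepsilon_i$, $y_i = \theta \xi_i + \varepsilon_i'$ with unit-variance Gaussian noise. The bisection solver, the bracket, the uniform nuisance values, and all function names are illustrative assumptions, not the slides' procedure.

```python
import random

def estimating_sum(theta, xs, ys, k):
    """Sum over the sample of f(x, y; theta) = k(x + theta*y) * (y - theta*x)."""
    return sum(k(x + theta * y) * (y - theta * x) for x, y in zip(xs, ys))

def solve_estimating_equation(xs, ys, k, lo, hi, iters=60):
    """Bisection for a root of the estimating sum inside the bracket [lo, hi]."""
    f_lo = estimating_sum(lo, xs, ys, k)
    f_hi = estimating_sum(hi, xs, ys, k)
    if f_lo * f_hi > 0.0:
        raise ValueError("bracket does not contain a sign change")
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = estimating_sum(mid, xs, ys, k)
        if f_lo * f_mid <= 0.0:
            hi = mid              # root lies in [lo, mid]
        else:
            lo, f_lo = mid, f_mid  # root lies in [mid, hi]
    return 0.5 * (lo + hi)

def simulate(theta, n, rng):
    """Draw (x_i, y_i) with a different nuisance value xi_i for every observation."""
    xs, ys = [], []
    for _ in range(n):
        xi = rng.uniform(1.0, 3.0)  # hypothetical nuisance values
        xs.append(xi + rng.gauss(0.0, 1.0))
        ys.append(theta * xi + rng.gauss(0.0, 1.0))
    return xs, ys
```

With `k = lambda s: s` the equation reduces to the TLS condition; other choices such as `k = lambda s: s + c` give different estimators of the same slope.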

Page 22

$u(x, y; \theta, Z) = \frac{\partial}{\partial \theta} \log p$

$v(x, y; \theta, Z) = E[f \mid s]$

$s = x + \theta y$, $k(s) \propto E[\xi \mid s]$

$u^{I}(x, y) = u - E[u \mid s] = k(x + \theta y)(y - \theta x)$

Page 23

$Z = N(\mu, \sigma^2)$:

$u^{I}(x, \theta, Z) \propto (x + \theta y + c)(y - \theta x)$, $c = \frac{\mu}{\sigma^2}$

$f(x, \theta; c) = (x + \theta y + c)(y - \theta x)$

$c = \frac{\bar{\mu}}{\bar{\sigma}^2}$, $\bar{\mu} = \frac{1}{n} \sum_i x_i$, $\bar{\sigma}^2 = \frac{1}{n} \sum_i x_i^2 - \bar{\mu}^2$

Page 24

$\sum_i f(x_i, y_i; \theta) = 0$

$f(x, y; \theta) = (x + \theta y + c)(y - \theta x)$, $c = \frac{\mu}{\sigma^2}$

[Comparison of the estimation error $V$ for three choices of $c$; the numerical values are not recoverable from the extracted text.]

Page 25

Poisson process

Poisson Process: Instantaneous firing rate is constant over time.

For every small time window dt, generate a spike with probability ξdt.

[Figure: two panels, "Cortical Neuron" and "Poisson Process"; horizontal axis T]

Poisson process cannot explain inter-spike interval distributions.

$p(T) = \xi e^{-\xi T}$ (Softky & Koch, 1993)

Page 26

Gamma distribution

Gamma Distribution: Every κ-th spike of the Poisson process is left.

$q(T; \xi, \kappa) = \frac{(\xi\kappa)^{\kappa}}{\Gamma(\kappa)} T^{\kappa - 1} e^{-\xi\kappa T}$

Two parameters:

ξ: Firing rate

κ: Irregularity

[Figure: densities over T for κ=1 (Poisson) and κ=3]
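The gamma inter-spike interval density can be evaluated directly. The sketch below assumes the parametrization $q(T; \xi, \kappa) = \frac{(\xi\kappa)^{\kappa}}{\Gamma(\kappa)} T^{\kappa - 1} e^{-\xi\kappa T}$, under which the mean interval is $1/\xi$ for every $\kappa$ and $\kappa = 1$ gives the exponential (Poisson) case. The function names and the midpoint-rule helper are illustrative.

```python
import math

def gamma_isi_density(t, xi, kappa):
    """Density q(T; xi, kappa) with shape kappa and rate xi*kappa (assumed form)."""
    if t <= 0.0:
        return 0.0
    rate = xi * kappa
    log_q = (kappa * math.log(rate) - math.lgamma(kappa)
             + (kappa - 1.0) * math.log(t) - rate * t)
    return math.exp(log_q)

def integrate(fn, a, b, steps=20000):
    """Midpoint rule, adequate for the smooth densities used here."""
    h = (b - a) / steps
    return h * sum(fn(a + (i + 0.5) * h) for i in range(steps))
```

Integrating `t * gamma_isi_density(t, xi, kappa)` numerically is one way to confirm that the firing rate ξ fixes the mean interval while κ only changes the shape.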

Page 27

Gamma distribution

$f(T) = \frac{(\xi\kappa)^{\kappa}}{\Gamma(\kappa)} T^{\kappa - 1} \exp\{-\xi\kappa T\}$

Integrate-and-fire

Markov model

$\kappa = 1$: Poisson

$\kappa \to \infty$: regular

Page 28

Irregularity κ is unique to individual neurons.

[Figure: spike trains over time t, Regular (large κ) and Irregular (small κ)]

Irregularity varies among neurons.

(Baker & Lemon 2000; Shinomoto et al., 2003)

We assume that κ is independent of time.

Page 29

estimating function

• Estimating function f(T,κ):

$E[f(T, \kappa)] = 0$

$\sum_{l=1}^{N} f(T_l; \kappa) = 0$

• Maximum likelihood method:

$\frac{d}{d\kappa} \log p(T_1) \cdots p(T_N) = \sum_{l=1}^{N} u(T_l; \kappa) = 0$

How to obtain an estimating function:

Score functions

$u(T; \kappa, k) = \frac{d \log p(T; \kappa, k)}{d\kappa}$

$v(T; \kappa, k) = \frac{\partial \log p(T; \kappa, k)}{\partial k(\xi)}$

$f = u - \frac{\langle u, v \rangle}{\langle v, v \rangle} v$

Page 30

Estimation of κ by estimating functions

1. No estimating function exists if the neighboring firing rates are different.

2. m (≥ 2) consecutive observations must have the same firing rate.

Estimating function: (E[f]=0)

Example: m=2

$f(T, \kappa) = \log \frac{T_1 T_2}{(T_1 + T_2)^2} + 2\phi(2\kappa) - 2\phi(\kappa)$

l-th set: $(T_1^{l}, T_2^{l})$, firing rate ξ

$\frac{1}{N} \sum_{l=1}^{N} \log \frac{T_1^{l} T_2^{l}}{(T_1^{l} + T_2^{l})^2} + 2\phi(2\kappa) - 2\phi(\kappa) = 0$

Model: $p(T_1, T_2; \kappa, k) = \int_0^{\infty} q(T_1; \xi, \kappa) \, q(T_2; \xi, \kappa) \, k(\xi) \, d\xi$
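The m=2 estimating equation can be sketched in code as follows. The sketch takes φ to be the digamma function, under which $\log \frac{T_1 T_2}{(T_1 + T_2)^2} + 2\phi(2\kappa) - 2\phi(\kappa)$ has zero mean for two gamma intervals sharing one firing rate, whatever that rate is. The digamma approximation, the bisection bounds, the uniform distribution of firing rates in the simulation, and all function names are assumptions made for illustration.

```python
import math
import random

def digamma(x):
    """Digamma function for x > 0 via recurrence plus an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series

def mean_log_ratio(pairs):
    """Average of log(T1*T2 / (T1+T2)**2) over the observed pairs."""
    return sum(math.log(a * b / (a + b) ** 2) for a, b in pairs) / len(pairs)

def estimating_mean(kappa, mean_stat):
    """(1/N) * sum_l f(T^l, kappa), with phi taken to be the digamma function."""
    return mean_stat + 2.0 * digamma(2.0 * kappa) - 2.0 * digamma(kappa)

def estimate_kappa(pairs, lo=1e-3, hi=1e3, iters=100):
    """Bisection on a log scale; estimating_mean is decreasing in kappa."""
    mean_stat = mean_log_ratio(pairs)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if estimating_mean(mid, mean_stat) > 0.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def simulate_pairs(kappa, n_pairs, rng):
    """Pairs of gamma intervals; the firing rate xi changes from pair to pair."""
    pairs = []
    for _ in range(n_pairs):
        xi = rng.uniform(1.0, 50.0)   # hypothetical range of firing rates
        scale = 1.0 / (xi * kappa)    # gammavariate takes (shape, scale)
        pairs.append((rng.gammavariate(kappa, scale),
                      rng.gammavariate(kappa, scale)))
    return pairs
```

Because the statistic depends only on the ratio $T_1 / (T_1 + T_2)$, the unknown firing rate of each pair drops out, which is why no estimate of ξ is needed.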

Page 31

em‐algorithm    EM‐algorithm

Page 32

EM algorithm

hidden variables: $p(x, y; u)$

$D = \{x_1, \ldots, x_N\}$

$M = \{p(x, y; u)\}$

$D_M = \{p(x, y) : p(x) = p_D(x)\}$

m-projection to $M$: $\min_{p \in M} KL[\hat{p}(x, y) : p]$

e-projection to $D_M$: $\min_{\hat{p} \in D_M} KL[\hat{p} : p(x, y; u)]$
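The alternation between the two steps can be illustrated with the ordinary statistical EM iteration on a toy model: an equal-weight mixture of two unit-variance Gaussians, where $x$ is observed, the component label $y$ is hidden, and $u$ is the pair of means. The model, the data, and all function names are assumptions of this sketch. It implements the E-step and M-step of EM and makes no claim that they coincide exactly with the e- and m-projections for this model.

```python
import math
import random

def log_likelihood(data, mu):
    """Log-likelihood of the observed x under the equal-weight mixture."""
    total = 0.0
    for x in data:
        p = 0.5 * sum(math.exp(-0.5 * (x - m) ** 2) for m in mu)
        total += math.log(p / math.sqrt(2.0 * math.pi))
    return total

def e_step(data, mu):
    """Posterior probability that each x came from component 1 (the hidden y)."""
    resp = []
    for x in data:
        w0 = math.exp(-0.5 * (x - mu[0]) ** 2)
        w1 = math.exp(-0.5 * (x - mu[1]) ** 2)
        resp.append(w1 / (w0 + w1))
    return resp

def m_step(data, resp):
    """Means that maximize the expected complete-data log-likelihood."""
    n1 = sum(resp)
    n0 = len(data) - n1
    mu0 = sum((1.0 - r) * x for r, x in zip(resp, data)) / n0
    mu1 = sum(r * x for r, x in zip(resp, data)) / n1
    return (mu0, mu1)

def em(data, mu, iters=50):
    """Alternate the two steps and record the log-likelihood after each pass."""
    history = [log_likelihood(data, mu)]
    for _ in range(iters):
        mu = m_step(data, e_step(data, mu))
        history.append(log_likelihood(data, mu))
    return mu, history
```

The recorded log-likelihoods never decrease from one iteration to the next, which is the property usually used to monitor convergence.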

Page 33