john p. cunningham, byron m. yu, krishna v. shenoy and ... · john p. cunningham, byron m. yu,...
TRANSCRIPT
Empirical models of spiking in neural populations
Jakob H. Macke, Lars Büsing,
John P. Cunningham, Byron M. Yu,
Krishna V. Shenoy and Maneesh Sahani
NIPS 2011December 15, 2011
() 1 / 19
Characterizing neural functionI Single neuron recordings
I Simultaneous recordings from populations of neurons
−400 −200 0 200 400 600 800 1000 1200 1400
10
20
30
40
50
60
70
80
90
Neuro
n index
Time
→ need for appropriate statistical models
() 2 / 19
Characterizing neural functionI Single neuron recordings
I Simultaneous recordings from populations of neurons
−400 −200 0 200 400 600 800 1000 1200 1400
10
20
30
40
50
60
70
80
90
Neuro
n index
Time
→ need for appropriate statistical models() 2 / 19
Models of population activity: direct couplings or latent variables?
model class I
model class II
generalized linear models (GLM)
latent dynamical system (DS)
externalinput
directcouplings
observedneurons
externalinput
dynamical system
observedneurons
e.g. [Chornoboy et al., Biol Cybern, 1988]
[Gat et al., 97, Network], [Yu et al., NIPS, 2006]
[Paninski et al., Neuro Comp, 2004]
() 3 / 19
Models of population activity: direct couplings or latent variables?
model class I model class II
generalized linear models (GLM) latent dynamical system (DS)
externalinput
directcouplings
observedneurons
externalinput
dynamical system
observedneurons
e.g. [Chornoboy et al., Biol Cybern, 1988] [Gat et al., 97, Network], [Yu et al., NIPS, 2006][Paninski et al., Neuro Comp, 2004]
() 3 / 19
Statistical models offer insights into neural computation
model class I
I GLMs improve decoding of visualstimuli from retina recordings
[Pillow et al., Nature, 2008]
model class II
I DS facilitate single trial analysis ofcortical multi-electrode recordings
[Chruchland et al, Nature Neurosci., 2010]
I Brain-computer interfaces[Wu et al., IEEE Tans. Neural Systems, 2000]
→What are good models for cortical multi-electrode recordings?
() 4 / 19
Statistical models offer insights into neural computation
model class I
I GLMs improve decoding of visualstimuli from retina recordings
[Pillow et al., Nature, 2008]
model class II
I DS facilitate single trial analysis ofcortical multi-electrode recordings
[Chruchland et al, Nature Neurosci., 2010]
I Brain-computer interfaces[Wu et al., IEEE Tans. Neural Systems, 2000]
→What are good models for cortical multi-electrode recordings?
() 4 / 19
Statistical models offer insights into neural computation
model class I
I GLMs improve decoding of visualstimuli from retina recordings
[Pillow et al., Nature, 2008]
model class II
I DS facilitate single trial analysis ofcortical multi-electrode recordings
[Chruchland et al, Nature Neurosci., 2010]
I Brain-computer interfaces[Wu et al., IEEE Tans. Neural Systems, 2000]
→What are good models for cortical multi-electrode recordings?
() 4 / 19
Data: Multi-electrode recordings from primate cortex
−250 0 250 500 750 1000 1250
10
20
30
40
50
60
70
80
90
Exam
ple
raste
r (s
ingle
trial)
N
euro
n index
Target on Go cue
−250 0 250 500 750 1000 12500
10
20
30Data used in analyses
Popula
tion P
ST
H
R
ate
(hz)
I Data from Shenoy’s lab
I Array with 96 electrodes in motor / premotor areas
I 92 units = dimensions, 120 s of recordings , 10ms binning
() 5 / 19
Model class I: Generalized linear models (GLMs)
I Instantaneous firing rate λi(t) of neuron i
λi(t) = exp([
det. innovations︷︸︸︷b(t) + D︸︷︷︸
couplings
·
spike history︷︸︸︷s(t) ]i)
I Poisson observations
yi(t) | s(t) = Poisson(λi(t))
I We used up to five different decayingexponentials as basis functions for the spikehistory:time constants 0.1 - 80ms
b(t)
I We used L1-regularization of the coupling matrix D to avoid over-fitting
e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
() 6 / 19
Model class I: Generalized linear models (GLMs)
I Instantaneous firing rate λi(t) of neuron i
λi(t) = exp([
det. innovations︷︸︸︷b(t) + D︸︷︷︸
couplings
·
spike history︷︸︸︷s(t) ]i)
I Poisson observations
yi(t) | s(t) = Poisson(λi(t))
I We used up to five different decayingexponentials as basis functions for the spikehistory:time constants 0.1 - 80ms
b(t)
I We used L1-regularization of the coupling matrix D to avoid over-fitting
e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
() 6 / 19
Model class I: Generalized linear models (GLMs)
I Instantaneous firing rate λi(t) of neuron i
λi(t) = exp([
det. innovations︷︸︸︷b(t) + D︸︷︷︸
couplings
·
spike history︷︸︸︷s(t) ]i)
I Poisson observations
yi(t) | s(t) = Poisson(λi(t))
I We used up to five different decayingexponentials as basis functions for the spikehistory:time constants 0.1 - 80ms b(t)
I We used L1-regularization of the coupling matrix D to avoid over-fitting
e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
() 6 / 19
Model class I: Generalized linear models (GLMs)
I Instantaneous firing rate λi(t) of neuron i
λi(t) = exp([
det. innovations︷︸︸︷b(t) + D︸︷︷︸
couplings
·
spike history︷︸︸︷s(t) ]i)
I Poisson observations
yi(t) | s(t) = Poisson(λi(t))
I We used up to five different decayingexponentials as basis functions for the spikehistory:time constants 0.1 - 80ms b(t)
I We used L1-regularization of the coupling matrix D to avoid over-fitting
e.g. [Chornoboy et al., Biol Cybern, 1988], [Paninski et al., Neuro Comp, 2004]
() 6 / 19
Model class II: Latent dynamical system (DS) models
I Latent variable x(t) evolves linearly, withGaussian innovations
x(t + 1) = A x(t)︸ ︷︷ ︸dynamics
+
innovations︷ ︸︸ ︷b(t) + η(t)
I Gaussian linear dynamical system (GLDS):
y(t) | x(t) ∼ N (Cx(t) + d,R)
I Poisson linear dynamical system (PLDS):
yi(t) | x(t) ∼ Poisson(exp(Ci,:x(t) + di + Disi(t)))
I Parameters were learned by the EM algorithm
I We used a global Laplace approximation to the posterior for the PLDS
() 7 / 19
Model class II: Latent dynamical system (DS) models
I Latent variable x(t) evolves linearly, withGaussian innovations
x(t + 1) = A x(t)︸ ︷︷ ︸dynamics
+
innovations︷ ︸︸ ︷b(t) + η(t)
I Gaussian linear dynamical system (GLDS):
y(t) | x(t) ∼ N (Cx(t) + d,R)
I Poisson linear dynamical system (PLDS):
yi(t) | x(t) ∼ Poisson(exp(Ci,:x(t) + di + Disi(t)))
I Parameters were learned by the EM algorithm
I We used a global Laplace approximation to the posterior for the PLDS
() 7 / 19
Model class II: Latent dynamical system (DS) models
I Latent variable x(t) evolves linearly, withGaussian innovations
x(t + 1) = A x(t)︸ ︷︷ ︸dynamics
+
innovations︷ ︸︸ ︷b(t) + η(t)
I Gaussian linear dynamical system (GLDS):
y(t) | x(t) ∼ N (Cx(t) + d,R)
I Poisson linear dynamical system (PLDS):
yi(t) | x(t) ∼ Poisson(exp(Ci,:x(t) + di + Disi(t)))
I Parameters were learned by the EM algorithm
I We used a global Laplace approximation to the posterior for the PLDS
() 7 / 19
Model class II: Latent dynamical system (DS) models
I Latent variable x(t) evolves linearly, withGaussian innovations
x(t + 1) = A x(t)︸ ︷︷ ︸dynamics
+
innovations︷ ︸︸ ︷b(t) + η(t)
I Gaussian linear dynamical system (GLDS):
y(t) | x(t) ∼ N (Cx(t) + d,R)
I Poisson linear dynamical system (PLDS):
yi(t) | x(t) ∼ Poisson(exp(Ci,:x(t) + di + Disi(t)))
I Parameters were learned by the EM algorithm
I We used a global Laplace approximation to the posterior for the PLDS
() 7 / 19
Measuring performance: predicting neural firing from the population
I On test trials, for every neuron i :
I Compute cross-prediction E[yi,:|y\i,:] of neuron yi,: given all other neurons y\i,:
I Compute MSE between prediction and true trajectory
I We report improvement over a constant predictor (i.e. higher = better)
() 8 / 19
Cross-prediction: Dynamical systems outperform GLMs
0 20 40 60 80 1000
0.5
1
1.5
2
2.5
3x 10
−3
Percentage of zero−entries in coupling matrix
Var
−M
SE
GLM 10msGLM 100msGLM2 100msGLM 150msPLDS dim=5
I GLM performance as a function of L1-regularization parameter:controls zero-entries in coupling matrix
I PLDS with 5-dimensional state space
I For our recordings PLDS outperforms GLMs with different history filters
() 9 / 19
Cross-prediction: Poisson observations improve performance
2 4 6 8 10 12 14 16 18 202
2.2
2.4
2.6
2.8
3
3.2x 10
−3
Dimensionality of latent space
Var
−M
SE
GLDSPLDSPLDS 100ms
I Poisson observation model improves performance over Gaussian models
I Single neuron history filters result in an slight increase in performance
() 10 / 19
Matching statistics of the data I: temporal cross-correlations
latent dynamical systems (DS)
generalized linear models (GLM)
−100 −50 0 50 100
0.01
0.02
0.03
0.04
0.05
Time lag (ms)
Cor
rela
tion
GLDSPLDSPLDS 100msrecorded data
−100 −50 0 50 100
0.01
0.02
0.03
0.04
0.05
Time lag (ms)
Cor
rela
tion
GLM 10msGLM 100msGLM 150srecorded data
I Data: peaks at zero time lag and broad flanks
I DS: matches cross-correlations well
I GLM: difficulties to reproduce peak due to lack of instantaneous couplings
() 11 / 19
Matching statistics of the data I: temporal cross-correlations
latent dynamical systems (DS) generalized linear models (GLM)
−100 −50 0 50 100
0.01
0.02
0.03
0.04
0.05
Time lag (ms)
Cor
rela
tion
GLDSPLDSPLDS 100msrecorded data
−100 −50 0 50 100
0.01
0.02
0.03
0.04
0.05
Time lag (ms)
Cor
rela
tion
GLM 10msGLM 100msGLM 150srecorded data
I Data: peaks at zero time lag and broad flanks
I DS: matches cross-correlations well
I GLM: difficulties to reproduce peak due to lack of instantaneous couplings
() 11 / 19
Matching statistics of the data II: population spike counts
0 10 20 30 400
0.02
0.04
0.06
0.08
0.1
0.12
0.14
Number of spikes in 10ms bin
Fre
quen
cy
GLMGLM 2GLDSPLDSreal data
I Gaussian DS: under-estimates large population events
I Poisson DS: good match to the data
I GLM: no regularization; GLM2: optimal regularization
() 12 / 19
How much variance of the data can we explain using cross-prediction?
% of explained variance
Comparison with PSTH predictor
0 50 100 150 200 250 300 350 4000
5
10
15
20
25
Binsize (ms)
Per
cent
age
of v
aria
nce
expl
aine
d
PLDS, d=1d=2d=5d=10d=20
0 50 100 150 200 250 300 350 400
0.8
1
1.2
1.4
1.6
1.8
2
2.2
2.4
2.6
Binsize (ms)
Impr
ovem
ent o
ver
PS
TH
pre
dict
or
PLDS, d=1d=2d=5d=10d=20
I For small bins, more variance can be explained than would be possible if firingconditioned on latent state were Poisson
I PLDS models can explain up to twice as much variance as a model withindependent activity conditioned on PSTH
() 13 / 19
How much variance of the data can we explain using cross-prediction?
% of explained variance
Comparison with PSTH predictor
0 50 100 150 200 250 300 350 4000
5
10
15
20
25
Binsize (ms)
Per
cent
age
of v
aria
nce
expl
aine
d
d=1d=2d=5d=10d=20Poisson
0 50 100 150 200 250 300 350 400
0.8
1
1.2
1.4
1.6
1.8
2
2.2
2.4
2.6
Binsize (ms)
Impr
ovem
ent o
ver
PS
TH
pre
dict
or
PLDS, d=1d=2d=5d=10d=20
I For small bins, more variance can be explained than would be possible if firingconditioned on latent state were Poisson
I PLDS models can explain up to twice as much variance as a model withindependent activity conditioned on PSTH
() 13 / 19
How much variance of the data can we explain using cross-prediction?
% of explained variance Comparison with PSTH predictor
0 50 100 150 200 250 300 350 4000
5
10
15
20
25
Binsize (ms)
Per
cent
age
of v
aria
nce
expl
aine
d
d=1d=2d=5d=10d=20Poisson
0 50 100 150 200 250 300 350 400
0.8
1
1.2
1.4
1.6
1.8
2
2.2
2.4
2.6
Binsize (ms)
Impr
ovem
ent o
ver
PS
TH
pre
dict
or
PLDS, d=1d=2d=5d=10d=20
I For small bins, more variance can be explained than would be possible if firingconditioned on latent state were Poisson
I PLDS models can explain up to twice as much variance as a model withindependent activity conditioned on PSTH
() 13 / 19
Summary & conclusions
I Dynamical systems match the investigated cortical recordings better than GLMs
I Cross-prediction
I Various statistics, e.g. cross-correlation structure
I GLM modelling assumptions (direct couplings) might not be appropriate for thistype of data:
I Multi-electrode arrays sample a neural population very sparsely
I Most inputs presumably come from outside the sampled population
I Latent dynamical systems are arguably the more natural models formulti-electrode recording from cortex
() 14 / 19
Summary & conclusions
I Dynamical systems match the investigated cortical recordings better than GLMs
I Cross-prediction
I Various statistics, e.g. cross-correlation structure
I GLM modelling assumptions (direct couplings) might not be appropriate for thistype of data:
I Multi-electrode arrays sample a neural population very sparsely
I Most inputs presumably come from outside the sampled population
I Latent dynamical systems are arguably the more natural models formulti-electrode recording from cortex
() 14 / 19
Summary & conclusions
I Dynamical systems match the investigated cortical recordings better than GLMs
I Cross-prediction
I Various statistics, e.g. cross-correlation structure
I GLM modelling assumptions (direct couplings) might not be appropriate for thistype of data:
I Multi-electrode arrays sample a neural population very sparsely
I Most inputs presumably come from outside the sampled population
I Latent dynamical systems are arguably the more natural models formulti-electrode recording from cortex
() 14 / 19
Acknowledgements
I Jakob Macke (Gatsby Unit, UCL)
I Maneesh Sahani (Gatsby Unit, UCL)
I Krishna Shenoy & lab (Stanford)
I John Cunningham (Cambridge)
I Byron Yu (Carnegie Mellon)
Funding:
I Gatsby Charitable FoundationI EU-FP7 Marie Curie FellowshipI DARPA: REPAIR N66001-10-C-2010I EPSRC EP/H019472/1I NIH CRCNS R01-NS054283I NIH Pioneer 1DP1OD006409
() 15 / 19
() 16 / 19
Additional material
() 17 / 19
Cross-prediction: area under ROC
0 20 40 60 80 100
0.58
0.6
0.62
0.64
0.66
0.68
Percentage of zero−entries in coupling matrix
AU
C
GLM 10msGLM 100msGLM2 100msGLM 150msPLDS dim=5
5 10 15 200.64
0.65
0.66
0.67
0.68
Dimensionality of latent space
AU
C
GPFAGLDSPLDSPLDS 100ms
I Same qualitative results as measured by MSE
I Benefits of history filters for PLDS are more pronounced
() 18 / 19
Correlations in data are well reflected by DS models
data bin at 10ms 5-dimensional PLDS
Neuron index
Neuro
n index
−0.1
−0.05
0
0.05
0.1
Neuron index
Neuro
n index
−0.1
−0.05
0
0.05
0.1
() 19 / 19