
Neural coding and information theory: Grandmother cells v. distributed codes

John Collins


Strengths of computers v. brains

Computer:

• Accurate and fast computation

• Accurate and fast storage

• Doesn’t get bored with repetitive tasks

Brain:

• Recognizing people

• Reading handwriting

• Adaptive

• Avoids obstacles

• Invents computers

• Survives and reproduces


Summary

• Primer about neurons

(Neural computation: inspiration for new computational methods, e.g., FSA, ANN, . . .)

• Issues: “local coding”, “distributed coding”, “grandmother cells”

• Analyze computationally (large scale systems!)

• Analysis of data. Extreme detection bias

See JCC + Dezhe Jin, “Grandmother cells and the storage capacity of the human brain”, http://arXiv.org/abs/q-bio.NC/0603014.

Mental health warning: Not (yet) generally accepted!


Neuron

• Dendrite / dendritic tree: input

• Axon: output

• Connection: synapse, ∼ 1 per several µm

• Dimensions (v. rough):

– Cell body: ∼ 10 µm

– Axon: diameter 0.1 to 20 µm, length: often several cm

– Several km of axon per mm^3 of brain

• Signals:

– Membrane potential

– Action potential: spike propagates at ∼ 60 m/s

• Human brain:

– Neurons: ∼ 10^11

– Synapses: per neuron ∼ 10^4; total ∼ 10^15


Computation with action potentials

• Input: across synapses from other neurons; + and −

• Membrane potential above threshold =⇒ action potential

• Active, non-linear

• Suitable for universal computational component


Neuron characterization

• Roughly: Neuron i fires an action potential when its total input (in ∼ 10 ms),

\[ \sum_{\text{other cells } j} w_{ij} y_j , \]

is above threshold

– \(w_{ij}\) is strength of synapse to i from j

– \(y_j\) is (pulsed) signal on neuron j

• Signal propagation controlled by: connection topology, synaptic strengths

• Time scales:

– Processing (like CPU, but highly parallel, ?> 10^6 GFLOPS?):

∗ Action potential: width ∼ 1 ms; processing step ∼ 10 ms

∗ From stimulus to recognition: ≲ 200 ms

∗ Feedforward and re-entrant connectivity

– Programming and storage (∼ 10^6 GB):

∗ Synaptic plasticity: ≳ 1 s

∗ Synaptogenesis/deletion

∗ Adult neurogenesis (in some areas): ∼ 2 to 4 weeks
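The firing rule on this slide can be sketched in a few lines of Python. This is only an illustration: the function name `fires`, the weights, and the threshold value are hypothetical, and the inputs are taken to be binary pulses within one ∼ 10 ms window.

```python
import numpy as np

def fires(w_i, y, threshold):
    """Return True if neuron i fires: sum_j w_ij * y_j exceeds the threshold.

    w_i: synaptic strengths onto neuron i (positive = excitatory, negative = inhibitory)
    y: pulsed signals on the presynaptic neurons within one ~10 ms window
    threshold: hypothetical firing threshold (arbitrary units)
    """
    total_input = float(np.dot(w_i, y))
    return total_input > threshold

# Illustrative values only, not taken from the talk.
w_i = np.array([0.5, 0.8, -0.6, 0.4])
y_active = np.array([1, 1, 0, 1])     # three excitatory inputs pulse
y_inhibited = np.array([1, 1, 1, 0])  # the inhibitory input pulses as well

print(fires(w_i, y_active, threshold=1.0))     # total input 1.7
print(fires(w_i, y_inhibited, threshold=1.0))  # total input 0.7
```

Which cells fire is then set entirely by the connection topology (which entries of w are non-zero) and by the synaptic strengths, as the next bullet states.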


Memory formation and retrieval

• Computational task:

– E.g., face =⇒ (person ID or episode) =⇒ action. In < 300 ms

• Coding by neurons and synapses:

– activity state

– storage state

• Computational, quantitative, algorithmic analysis

– What is coded and stored?

– Data structure ⇐⇒ Efficiency, algorithms ⇐⇒ Slow and fast learning

– Large number of similar components

– Micro- v. macro-behavior

• Of course: component of bigger system


What is known and/or measured (Enormous amount of data!)

• Psychology: Behavior, etc

• Anatomical etc. =⇒ functional regions (e.g., hippocampus for memory)

• EEG: v. poor space resolution

• fMRI: poor space resolution, v. poor time resolution

• . . .

• Extracellular and intracellular electrodes

• . . .

• Biochemistry, etc

– Not known: Overall view


Hierarchy of processing

[Figure: processing hierarchy, from retina via lines, ... to objects, ...]

• Hierarchy: Points → lines → · · · → objects, etc

• Known: Direct meaning of many intermediate cells

• 10–20 feedforward steps. But important feedback connections


Some responses of one neuron:

From: Zigmond, Bloom, Landis, Roberts, & Squire, “Fundamental Neuroscience” (Academic, 1999) p. 1351, from Bruce, Desimone, & Gross, J. Neurophysiol. 46, 369 (1981)


The debugger: Extracellular recordings

[Figure: electrode with ~150 cells in range, of which ~5 are detected]

• Factor K ∼ 30 more cells in range of electrode than detected

• Detected:

– Below threshold: “spontaneous”/“background” firing

– Above threshold for ≥ 1 stimulus: “responsive”

• Many “silent cells”. I.e., not reported in paper!


Recent data: Halle Berry cell

the false positive rate (x axis) the relative number of responses to other pictures. The ROC curve corresponds to the performance of a linear binary classifier for different values of a response threshold. Decreasing the threshold increases the probability of hits but also of false alarms. A cell responding to a large set of pictures of different individuals will have a ROC curve close to the diagonal (with an area under the curve of 0.5), whereas a cell that responds to all pictures of an individual but not to others will have a convex ROC curve far from the diagonal, with an area close to 1. In Fig. 1c we show the ROC curve for all seven pictures of Jennifer Aniston (red trace, with an area equal to 1). The grey lines show 99 ROC surrogate curves, testing invariance to randomly selected groups of pictures (see Methods). As expected, these curves are close to the diagonal, having an area of about 0.5. None of the 99 surrogate curves had an area equal or larger than the original ROC curve, implying that it is unlikely (P < 0.01) that the responses to Jennifer Aniston were obtained by chance. A responsive unit was defined to have an invariant representation if the area under the ROC curve was larger than the area of the 99 surrogate curves.

Figure 2 shows another single unit located in the right anterior hippocampus of a different patient. This unit was selectively activated by pictures of the actress Halle Berry as well as by a drawing of her (but not by other drawings; for example, picture no. 87). This unit was also activated by several pictures of Halle Berry dressed as Catwoman, her character in a recent film, but not by other images of Catwoman that were not her (data not shown). Notably, the unit was selectively activated by the letter string ‘Halle Berry’. Such an invariant pattern of activation goes beyond common visual features of the different stimuli. As with the previous unit, the responses were mainly localized between 300 and 600 ms after stimulus onset.

Figure 2 | A single unit in the right anterior hippocampus that responds to pictures of the actress Halle Berry (conventions as in Fig. 1). a–c, Strikingly, this cell also responds to a drawing of her, to herself dressed as Catwoman (a recent movie in which she played the lead role) and to the letter string ‘Halle Berry’ (picture no. 96). Such an invariant response cannot be attributed to common visual features of the stimuli. This unit also had a very low baseline firing rate (0.06 spikes). The area under the red curve in c is 0.99.

LETTERS NATURE|Vol 435|23 June 2005

1104

© 2005 Nature Publishing Group

[From: Quian Quiroga et al, Nature 435, 1102 (2005)]
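The ROC-area test described in the quoted passage can be illustrated with a short Python sketch. It is not the authors' code: the spike counts are synthetic, the function names are hypothetical, and the surrogate groups are drawn by simple random permutation, which is only one plausible reading of "randomly selected groups of pictures".

```python
import numpy as np

def roc_area(target, other):
    """Area under the ROC curve for target vs. other responses.

    Equals the probability that a randomly chosen target response exceeds
    a randomly chosen non-target response, with ties counted as one half.
    """
    t = np.asarray(target, dtype=float)[:, None]
    o = np.asarray(other, dtype=float)[None, :]
    return float(np.mean((t > o) + 0.5 * (t == o)))

def surrogate_areas(responses, n_target, n_surrogates=99, seed=0):
    """ROC areas for randomly selected groups of n_target pictures."""
    rng = np.random.default_rng(seed)
    responses = np.asarray(responses, dtype=float)
    areas = np.empty(n_surrogates)
    for k in range(n_surrogates):
        idx = rng.permutation(len(responses))
        areas[k] = roc_area(responses[idx[:n_target]], responses[idx[n_target:]])
    return areas

# Synthetic spike counts (hypothetical): 7 pictures of one person, 80 others.
target = np.array([9, 8, 10, 7, 9, 8, 11])
other = np.array([0, 1, 0, 0, 2, 1, 0, 0, 1, 0] * 8)
responses = np.concatenate([target, other])

area = roc_area(target, other)
surrogates = surrogate_areas(responses, n_target=len(target))
# Invariance criterion from the passage: area larger than all 99 surrogates.
invariant = bool(np.all(surrogates < area))
print(area, surrogates.mean(), invariant)
```

With perfectly separated responses the area is 1, while the surrogate areas scatter around 0.5, which is the pattern the passage reports for the Jennifer Aniston unit.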


Distributed v. Local (“Grandmother-Cell”) Representations

• Distributed representation: overlapping firing between categories

• GM categorizer: Exclusive firing of groups of cells between categories


Examples for distributed coding v. grandmother cells

• GM cell responds to small fraction of stimuli:

– Halle-Berry cell: seeing her face, her name

– Memory for wedding: the wedding couple, parents, . . .

• Distributed-code cell responds to many more stimuli:

– Female face: all female film stars, any bride, . . .

– Beard: any bearded person


But:

• Distributed representation: overlapping firing between categories

• Reality: GM cells (if they exist at all) need distributed input/output:

[Figure: network diagram with labels: Raw input, Processed input, G, Output]

• Therefore GM categorizer is actually


Basic memory cell model: Possible sub-system

[Figure: network diagram with labels: Input, GM mem. cells, Output]

• Learning: activate synapses to/from unallocated/new memory cell

• “Memory cells”: GM-like cells. (Expect huge number.)

• Inhibitory interneurons to improve performance

• Advantages/features:

– Sparse binary (distributed) input. (Feature detectors?)

=⇒ One-trial learning, unimportant interference

=⇒ Recall: partial stimulus can activate memory

– Maximally efficient in use of synapses. (Information storage)
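A toy version of such a memory-cell sub-system can make the one-trial learning and partial-stimulus recall concrete. The sketch below is an assumption-laden illustration, not the model from the talk or from the cited papers: the class `MemoryCellStore`, the helper `sparse_pattern`, and the `recall_fraction` threshold are all hypothetical, and the inhibitory interneurons are omitted.

```python
import numpy as np

class MemoryCellStore:
    """Toy local-memory model: one GM-like memory cell per stored pattern."""

    def __init__(self, n_inputs, recall_fraction=0.6):
        self.n_inputs = n_inputs
        self.recall_fraction = recall_fraction  # hypothetical threshold parameter
        self.synapses = []  # synapses[c] marks the inputs wired to memory cell c

    def learn(self, pattern):
        """One-trial learning: allocate a new cell and switch on its synapses."""
        pattern = np.asarray(pattern, dtype=bool)
        if pattern.shape != (self.n_inputs,) or not pattern.any():
            raise ValueError("pattern must be a non-empty binary vector of length n_inputs")
        self.synapses.append(pattern.copy())
        return len(self.synapses) - 1  # index of the newly allocated cell

    def recall(self, stimulus):
        """Return the memory cells driven above threshold by the stimulus."""
        stimulus = np.asarray(stimulus, dtype=bool)
        active = []
        for cell, wired in enumerate(self.synapses):
            overlap = np.count_nonzero(wired & stimulus)
            if overlap >= self.recall_fraction * np.count_nonzero(wired):
                active.append(cell)
        return active

def sparse_pattern(n_inputs, active_inputs):
    """Build a sparse binary input vector with the given inputs switched on."""
    pattern = np.zeros(n_inputs, dtype=bool)
    pattern[list(active_inputs)] = True
    return pattern

store = MemoryCellStore(n_inputs=20)
cell_a = store.learn(sparse_pattern(20, range(0, 5)))
cell_b = store.learn(sparse_pattern(20, range(10, 15)))

partial = sparse_pattern(20, range(0, 4))  # 4 of the 5 features of pattern A
print(store.recall(partial))
```

Each stored pattern touches only the synapses of its own cell, so learning a new pattern does not disturb earlier ones, and a stimulus containing most of a pattern's features is enough to activate that pattern's cell.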


Why not GM cells?

• Textbooks say: Representationally hopelessly inefficient; impossible

• But this argument incorrect for memory storage

• Properties. E.g.:

– How often responsive? Distributed-code cells: 5%; GM cells: 1/10^5

– How many cells? Distributed-code cells: 0.2%; GM cells: > 99%

• =⇒ We link to data on sample:

– General method of analyzing sample of neuronal responses

– Deduce population fractions and “sparsities”

– Implications


2-population analysis

Data:

• 60 000 cells, 93.9 images

• 132 responsive cells

– 51 respond to 1 image (of 93.9)

– 81 respond to ≥ 2. Avg. 4.1

• Rest non-face (silent)

[Figure: histogram; horizontal axis ticks 1 to 10, vertical axis ticks 10 to 50]

Model:

– fD = 0.2%: distributed cells, sparsity a = 4%

– fGM: GM-like cells, repertoire R

– Rest: silent, non-responsive

43 GM cells:

• \( 43 = \frac{1}{R} f_{GM} (93.9 \times 60\,000) \) ⇒ R ∼ 10^5 fGM

• I.e., up to 10^5 categories for classic GM cells

• Many more for memory cells à la BMW
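The repertoire estimate follows from one line of arithmetic, reproduced here as a small Python check. The variable names are illustrative; the numbers are the ones quoted on this slide.

```python
n_cells = 60_000       # cells in range of the electrodes
n_images = 93.9        # average number of images per screening session
n_gm_detected = 43     # detected GM-like cells

# 43 = (1/R) * f_GM * (93.9 * 60 000)  =>  R / f_GM = trials / 43
trials = n_cells * n_images
R_over_fGM = trials / n_gm_detected   # roughly 1.3e5, i.e. R ~ 10^5 f_GM

print(f"trials = {trials:.3g}, R / f_GM = {R_over_fGM:.3g}")
```

Since fGM cannot exceed 1, this ratio is also the upper limit on R, which is the "up to 10^5 categories" statement.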


Analysis by experimental group

• Single population, single sparsity

• Less incisive analysis: less informative observables

• Fit sparsity 0.23% to 0.54%

• Actually poor fit, with compromise between two populations

• Contrast our fit: Few at 4% sparsity plus many GM cells at 10^−5

[Waydo et al., J. Neurosci. 26, 10232 (2006)]


Summary and outlook

• New: Deduction of two classes of cell (in hippocampus etc)

– “Image processors”: relatively frequently firing. 0.2% of cells

– “Memory cells”: Ultra-sparsely firing. > 99% of cells

– Extreme bias against detection of memory cells.

– Memories/components 10^5 or more.

• Removed textbook arguments against “local memory cells”

• (Generalized) GM cells for recognition and memories can be efficient

• General method for analysis of data with multiple cell populations, to compensate for extreme detection bias against GM-like cells

• Future:

– Physics of detection

– Extend to other data. Anatomy

– Predictions for mechanisms and algorithms

– Build explicit models

– Bigger picture


Appendix


References

• Hopfield model and developments: “Distributed memory”, attractor states:

– J.J. Hopfield, “Neural Networks and Physical Systems with Emergent Collective Computational Abilities”, Proc. Natl. Acad. Sci. 79, 2554–2558 (1982)

– J.J. Hopfield, http://arXiv.org/abs/q-bio.NC/0609006

• “Local”, “grandmother-cell” (GM) model for memory:

– E.B. Baum, J. Moody, and F. Wilczek, “Internal representations for associative memory”, Biol. Cybern. 59, 217–228 (1988)

• Our paper: J. Collins & D.-Z. Jin, “Grandmother cells and the storage capacity of the human brain”, http://arXiv.org/abs/q-bio.NC/0603014. Ver. 2

• Recent data (Halle-Berry cell, etc), and the experimentalists’ analysis:

– R. Quian Quiroga, L. Reddy, G. Kreiman, C. Koch, and I. Fried, “Invariant visual representation by single neurons in the human brain”, Nature 435, 1102–1107 (2005)

– S. Waydo, A. Kraskov, R. Quian Quiroga, I. Fried, and C. Koch, “Sparse Representation in the Human Medial Temporal Lobe”, J. Neurosci. 26, 10232–10234 (2006)

• And references therein


Recent data: More from the Halle Berry cell

[From: Quian Quiroga et al, Nature 435, 1102 (2005)]


How to measure distributed v. GM?

• Sparsity α of cell: fraction of stimuli it responds to. Often small

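• One cell: Sample of p stimuli, n responses:

\[ P(n, \text{cell}_i) = \alpha_i^n (1-\alpha_i)^{p-n} \frac{p!}{n!\,(p-n)!} \simeq (p\alpha_i)^n e^{-p\alpha_i} \frac{1}{n!} \]

• Sample of cells, with sample of p stimuli:

\[ P(n|p) = \int_0^1 d\alpha \, D(\alpha) \, \alpha^n (1-\alpha)^{p-n} \frac{p!}{n!\,(p-n)!} \simeq \int_0^1 d\alpha \, D(\alpha) \, (p\alpha)^n e^{-p\alpha} \frac{1}{n!} \]

=⇒ E.g.:

[Figure: histogram; horizontal axis ticks 1 to 10, vertical axis ticks 10 to 50]

– GM population: α ≃ 1/R, e.g., 10^−5

– Distributed population, e.g., α ∼ several %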
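The formulas on this slide can be checked numerically. The Python sketch below compares the binomial expression with its Poisson approximation and evaluates P(n|p) for a D(α) made of two sharp populations. The function names are hypothetical, p = 94 is an integer stand-in for the average of 93.9 images, and the population fractions are illustrative assumptions rather than fitted values.

```python
from math import comb, exp, factorial

def binomial_pmf(n, p, alpha):
    """Exact probability that a cell of sparsity alpha responds to n of p stimuli."""
    return comb(p, n) * alpha**n * (1 - alpha)**(p - n)

def poisson_pmf(n, p, alpha):
    """Poisson approximation (p*alpha)^n exp(-p*alpha) / n!, valid for small alpha."""
    mean = p * alpha
    return mean**n * exp(-mean) / factorial(n)

def mixture_pmf(n, p, populations):
    """P(n|p) when D(alpha) is a sum of sharp peaks.

    populations: list of (fraction of cells, sparsity alpha) pairs.
    """
    return sum(f * poisson_pmf(n, p, alpha) for f, alpha in populations)

p = 94  # integer stand-in for the 93.9 images per session
# Illustrative two-population D(alpha): a few distributed cells, many GM-like cells.
populations = [(0.002, 0.04), (0.998, 1e-5)]

for n in range(5):
    print(n, binomial_pmf(n, p, 0.04), poisson_pmf(n, p, 0.04),
          mixture_pmf(n, p, populations))
```

For the distributed population the two expressions agree closely, while the GM-like population contributes almost only at n = 0 and n = 1, which is why the two populations can be separated from the response histogram.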


Quian Quiroga et al. method

• 8 patients @ 250 detected cells: Total 2000 cells

• Extracellular electrodes, sensitive to many cells; improved spike-sorting

• (Us: Multiply cells by K ∼ 30 for silent cell correction)

• I.e., 60 000 cells in range of electrodes

• Measurements:

– Screening session (find responsive cells): 93.9 different images (familiar people, etc) =⇒ 60 000 × 93.9 ≃ 6 × 10^6 trials

– Testing session (selectivity): different views


Sample of images and cells (screening sessions)

Data [QQ et al, Nature 435, 1102 (2005)]

• 60 000 cells, 93.9 images

• 132 responsive cells

– 51 respond to 1 image (of 93.9)

– 81 respond to ≥ 2. Avg. 4.1

• Rest non-responsive or non-detected

[Figure: histogram; horizontal axis ticks 1 to 10, vertical axis ticks 10 to 50]

Model: 3 fractions:

– fD: distributed cells, sparsity a

– fGM: GM-like cells, repertoire R

– Rest: silent, non-responsive

Distributed rep. cells:

• Poisson response distribution

• Fit: sparsity a = 4%

81 + 8 + 2 = 91 cells

⇒ fD = 91/(60 000) ≃ 0.2% (!)

⇒ Rest (> 99%) of cells: GM or silent.

⇒ 51 − 8 = 43 detected GM cells
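The numbers 81 + 8 + 2 = 91 and 51 − 8 = 43 can be reproduced with a Poisson response distribution at sparsity a = 4%. The sketch below is one reading that recovers the slide's figures; it assumes that all 81 cells responding to ≥ 2 images belong to the distributed population, and the variable names are illustrative.

```python
from math import exp

n_cells = 60_000   # cells in range of the electrodes
n_images = 93.9    # average number of screening images
a = 0.04           # fitted sparsity of the distributed cells
n_multi = 81       # detected cells responding to >= 2 images
n_single = 51      # detected cells responding to exactly 1 image

mean = n_images * a            # expected responses per distributed cell
p0 = exp(-mean)                # Poisson probability of 0 responses
p1 = mean * exp(-mean)         # Poisson probability of exactly 1 response
p_multi = 1.0 - p0 - p1        # probability of >= 2 responses

# Assumption: every cell with >= 2 responses is a distributed cell.
n_dist = n_multi / p_multi       # total distributed cells (about 91)
n_dist_single = n_dist * p1      # distributed cells with 1 response (about 8)
n_dist_silent = n_dist * p0      # distributed cells with 0 responses (about 2)

f_d = n_dist / n_cells           # distributed fraction, of order 0.2%
n_gm = n_single - n_dist_single  # single-image cells left for the GM population

print(round(n_dist), round(n_dist_single), round(n_dist_silent), round(n_gm))
print(f"f_D = {100 * f_d:.2f}%")
```

The remaining single-image responders are then attributed to the GM-like population, which gives the 43 detected GM cells used in the repertoire estimate.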


2-population analysis: GM-cell part

Data:

• 60 000 cells, 93.9 images

• 132 responsive cells

– 51 respond to 1 image (of 93.9)

– 81 respond to ≥ 2. Avg. 4.1

• Rest non-face (silent)

[Figure: histogram; horizontal axis ticks 1 to 10, vertical axis ticks 10 to 50]

Model:

– fD = 0.2%: distributed cells, sparsity a = 4%

– fGM: GM-like cells, repertoire R

– Rest: silent, non-responsive

43 GM cells:

• \( 43 = \frac{1}{R} f_{GM} (93.9 \times 60\,000) \) ⇒ R ∼ 10^5 fGM

• I.e., up to 10^5 categories for classic GM cells

• Many more for memory cells à la BMW


Information in stimuli

Telephone numbers:

– Input: arbitrary stimulus. Storage: 1000 remembered

– Info. current state: 10 digits = 34 bits

– Info. stored: 1000 × (34 + . . .) bits (with associations)

– Neurons in GM rep.: 10^10 possible stimuli (input); 1000 recognized stimuli (storage)

– Neurons in distributed rep.: ∼ 100

– Synapses: ≃ 1000 × (34 + . . .) synapses
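The bit counts in the telephone-number example can be verified directly. This is a small worked check in Python with illustrative variable names; the "+ . . ." for associations on the slide is left out because its size is not given.

```python
from math import ceil, log2

digits = 10          # digits in one telephone number
n_remembered = 1000  # telephone numbers held in memory

possible_stimuli = 10**digits                  # distinct 10-digit inputs
bits_per_number = ceil(log2(possible_stimuli)) # whole bits needed for one number
min_bits_stored = n_remembered * bits_per_number  # lower bound, associations excluded

print(possible_stimuli, bits_per_number, min_bits_stored)
```

A GM representation of every possible input would need one cell per possible stimulus, whereas a store of the remembered numbers needs only one memory cell per recognized stimulus and of order 1000 × 34 synapses.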


Naive estimate of bias in cell detection for earlier data

E.g., Abbott, Rolls & Tovee (1996):

• 14 face-responsive neurons, 20 face stimuli, monkeys

• Our first fit to 2005 human data, blindly applied, postdicts detected GM cells

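Population D:

– Popn. frac.: 5%/K

– QQ (2005): detected frac. 4.5%/K = 0.15%; detected # 89

– ART (1996): detected frac. 2.5%/K = 0.08%; detected # 14 − GM#

Population GM:

– Popn. frac.: fGM ≤ 99%

– QQ (2005): detected frac. 93.9 fGM/(4400 K fGM) = 0.07%; detected # 43

– ART (1996): detected frac. 20/(4400 K) = 0.015%; detected # ⇒ 2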

• ART (1996): “two . . . cells showed ‘grandmother’-like responses”

• In prediction of bias, value of fGM and number of extra very-silent cells cancel

• Cell-number ratios independent of K and fGM
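The postdiction of about 2 GM cells among the 14 detected neurons of ART (1996) can be traced through in Python. This sketch uses only the detected fractions quoted in the table, 2.5%/K for distributed cells and 20/(4400 K) for GM cells; the function name and the way the 14 cells are split by the ratio of detected fractions are assumptions made for the illustration.

```python
def postdicted_gm_cells(K, n_detected=14, n_stimuli=20):
    """Split the detected cells between D and GM populations by detected fraction."""
    det_frac_d = 0.025 / K                # distributed cells: 2.5% / K
    det_frac_gm = n_stimuli / (4400 * K)  # GM cells: 20 / (4400 K); f_GM has cancelled
    ratio = det_frac_gm / det_frac_d      # GM-to-D ratio of detected cells
    return n_detected * ratio / (1 + ratio)

n_gm = postdicted_gm_cells(K=30)
print(round(n_gm, 2), round(n_gm))

# The silent-cell factor K drops out of the ratio, as the last bullet states.
print(abs(postdicted_gm_cells(K=10) - n_gm) < 1e-9)
```

The result rounds to 2 cells, matching the ART (1996) remark that two cells showed ‘grandmother’-like responses.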