A cortical model for learning complex temporal structure in sensory streams
Subutai Ahmad, Yuwei Cui, and Jeff Hawkins
{sahmad, ycui, jhawkins}@numenta.com
TRANSCRIPT
Sequence learning is ubiquitous in cortex. What is the neural mechanism for sequence learning?

HTM sequence memory:
1. Neurons learn to recognize hundreds of patterns using active dendrites.
2. Recognition of a pattern acts as a prediction by depolarizing the cell without generating an immediate action potential.
3. A network of neurons with active dendrites forms a powerful sequence memory.
4. Sparse representations lead to highly robust recognition.
5. Agrees well with experimental evidence. Experimental support for active dendrites: Poirazi et al., 2003; Palmer et al., 2014; Major, Larkum and Schiller, 2013.

HTM neuron model (pyramidal neuron vs. HTM neuron):
- Active dendrites act as pattern detectors; each cell can recognize hundreds of unique patterns.
- Feedforward: hundreds of synapses on the proximal dendrite form the "classic" receptive field; the proximal dendrite can recognize a feedforward input pattern and activate the cell.
- Context: thousands of synapses on distal dendrites; each distal dendrite segment can recognize a particular pattern and cause a dendritic spike, depolarizing the neuron into a "predicted state".
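Point 4 can be illustrated with a small sketch: a dendritic segment that fires when its overlap with a stored sparse pattern exceeds a threshold stays reliable even when many active bits are corrupted, and almost never matches an unrelated pattern. The parameters (2048 cells, 40 active bits, threshold 15) are illustrative choices, not values from the poster.

```python
import random

random.seed(42)
N, W, THETA = 2048, 40, 15  # cells, active bits per pattern, match threshold

def sdr():
    """A random sparse distributed representation: W active cells out of N."""
    return set(random.sample(range(N), W))

def matches(segment, pattern):
    # A segment "recognizes" an input when the synaptic overlap is above threshold.
    return len(segment & pattern) >= THETA

stored = sdr()
segment = set(stored)  # segment stores a synapse to every bit of the pattern

# Robustness: keep only half of the pattern's active bits, replace the rest
# with random noise bits; the segment still matches.
noisy = set(random.sample(sorted(stored), W // 2))
while len(noisy) < W:
    noisy.add(random.randrange(N))
print(matches(segment, noisy))  # True

# Selectivity: an unrelated random sparse pattern essentially never matches,
# because the expected overlap of two random 40-of-2048 SDRs is below 1 bit.
false_positives = sum(matches(segment, sdr()) for _ in range(1000))
print(false_positives)  # 0
```

The large margin between the match threshold and the chance overlap of random sparse patterns is what makes the recognition highly robust.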
HTM network model for sequence learning
Learning and activation rules (Hawkins and Ahmad, 2016)

Activation rules:
- Select the top 2% of columns with the strongest inputs on the proximal dendrite as active columns.
- If any cell in an active column is predicted, only the predicted cells fire.
- If no cell in an active column is predicted, all cells in the column fire.
- A pattern detected on a distal dendrite causes the cell to be depolarized (predicted).

Unsupervised Hebbian-like learning rules:
- If a depolarized cell subsequently becomes active, its active dendritic segment is reinforced.
- If a depolarized cell does not become active, a small decay is applied to its active segments.
- If no cell in an active column is predicted, the cell with the most activated segment is reinforced.
[Figure: learning ABCD vs. XBCY. Before learning, every cell in the active columns fires for the shared inputs B and C. After learning, context splits the representations: A B' C' D' in one sequence and X B'' C'' Y'' in the other; same columns, but only one cell active per column after learning. Legend: active cells, depolarized (predictive) cells, inactive cells; time runs left to right.]

The network learns complex high-order sequences, e.g. ABCD vs. XBCY.
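The activation and learning rules above can be sketched as a toy network that reproduces the ABCD vs. XBCY example. This is a minimal sketch, not Numenta's implementation: each symbol maps to its own disjoint set of columns (a real HTM network derives sparse, overlapping column SDRs from the input), segments store raw sets of presynaptic cells instead of permanence-weighted synapses, and the "small decay" rule is simplified to deleting a mismatched segment outright.

```python
from collections import defaultdict

CELLS_PER_COL = 4
COLS_PER_SYM = 5
SYMBOLS = "ABCDXY"
THRESHOLD = 3        # a distal segment fires when this many of its synapses are active

# Simplification: each symbol activates its own fixed set of columns.
columns = {s: list(range(i * COLS_PER_SYM, (i + 1) * COLS_PER_SYM))
           for i, s in enumerate(SYMBOLS)}
segments = defaultdict(list)  # cell -> list of segments (sets of presynaptic cells)

def cells_of(col):
    return [(col, i) for i in range(CELLS_PER_COL)]

def depolarized(active):
    """Cells with at least one distal segment matching the currently active cells."""
    return {cell for cell, segs in segments.items()
            if any(len(seg & active) >= THRESHOLD for seg in segs)}

def step(symbol, prev_active, prev_winners, learn=True):
    predicted = depolarized(prev_active)
    active, winners = set(), set()
    for col in columns[symbol]:
        hits = [c for c in cells_of(col) if c in predicted]
        if hits:                          # predicted cells inhibit their neighbors
            active.update(hits)
            winners.update(hits)
        else:                             # no prediction: the whole column bursts
            active.update(cells_of(col))
            winner = min(cells_of(col), key=lambda c: len(segments[c]))
            winners.add(winner)
            if learn and prev_winners:    # grow a segment encoding the context
                segments[winner].append(set(prev_winners))
    if learn:                             # "small decay" simplified to removal
        for cell in predicted - active:
            segments[cell] = [seg for seg in segments[cell]
                              if len(seg & prev_active) < THRESHOLD]
    return active, winners

def run(sequence, learn=True):
    active, winners = set(), set()
    for sym in sequence:
        active, winners = step(sym, active, winners, learn)
    return active

for _ in range(10):                       # interleave the two training sequences
    run("ABCD")
    run("XBCY")

# After learning, context disambiguates the shared subsequence BC:
after_abc = depolarized(run("ABC", learn=False))
after_xbc = depolarized(run("XBC", learn=False))
print(sorted({col for col, _ in after_abc}) == columns["D"])  # True
print(sorted({col for col, _ in after_xbc}) == columns["Y"])  # True
```

Bursting is what encodes ambiguity: an unanticipated input activates every cell in its columns, and learning then assigns one context-specific cell per column, which is why only one cell per column is active once the sequences are learned.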
Testable predictions and experimental validation
(In collaboration with Spencer Smith and Yiyi Yu, UNC, Chapel Hill)

Wide-FOV calcium imaging (Stirman et al., 2016) of V1 and AL while a naturalistic movie is presented on a screen.
[Figure: example population responses; raster plots of neuron # vs. time (sec) in V1 and AL for trial 1 and trial 20.]

HTM sequence memory predicts structured, sparser activity for learned sequences.
- Data: sparser activity over repeated presentations. [Figure: activation density (x10^-3) vs. trial number (0-20) for V1 and AL.]
- Data: the sparse activity is a subset of the activity before learning. [Figure: active cell # on both first and last trials vs. active cell # on last trial.]

HTM sequence memory predicts the presence of high-order cell assemblies.
- Data: presence of high-order cell assemblies. [Figures: trial rasters of three example cells and of a cell assembly, with and without shuffled spikes; number of unique cell assemblies (x10^4) vs. cell assembly order (3-6) for data, shuffled, and Poisson controls in V1 and AL; subset index for data vs. shuffled vs. Poisson in V1 and AL.]
- Cell assemblies are locked to the stimulus. [Figure: probability of a repeated cell assembly vs. time jitter (sec) for single cells and 3-order assemblies in V1 and AL.]
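One way an assembly count like the one above can be sketched (this is an illustrative analysis with synthetic data, not the authors' exact pipeline): count distinct groups of k cells that fire in the same time bin on at least two bins, and compare against a shuffled control with the same firing rates but no co-activation structure.

```python
import random
from collections import Counter
from itertools import combinations

random.seed(0)

def count_assemblies(raster, order):
    """Count distinct groups of `order` cells that are co-active in the same
    time bin on at least two different bins of the raster."""
    counts = Counter()
    for frame in raster:  # frame = set of co-active cell ids in one time bin
        for group in combinations(sorted(frame), order):
            counts[group] += 1
    return sum(1 for c in counts.values() if c >= 2)

# Structured raster: the same five groups of cells fire together repeatedly,
# as if locked to a repeated stimulus.
structured = [set(random.sample(range(200), 6)) for _ in range(5)] * 20

# Shuffled control: same number of bins and same sparsity, but random groups.
shuffled = [set(random.sample(range(200), 6)) for _ in range(100)]

print(count_assemblies(structured, 3), count_assemblies(shuffled, 3))
```

The structured raster yields on the order of a hundred repeated 3rd-order assemblies, while the rate-matched control yields almost none, mirroring the data-vs-shuffled comparison in the figure.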
References
- Cui Y, Ahmad S, Hawkins J (2016). Neural Computation 28: 2474-2504.
- Hawkins J, Ahmad S (2016). Front. Neural Circuits 10: 1-13.
- Major G, Larkum ME, Schiller J (2013). Annu. Rev. Neurosci. 36: 1-24.
- Miller JK, Ayzenshtat I, Carrillo-Reid L, Yuste R (2014). PNAS 111(38): E4053-61.
- Piuri V (2001). J. Parallel Distr. Com. 61: 18-48.
- Poirazi P, Brannon T, Mel BW (2003). Neuron 37: 989-999.
- Stirman JN, Smith IT, Kudenov MW, Smith SL (2016). Nat. Biotechnol. 34: 857-862.
Task: predict taxi passenger count in NYC. HTM has comparable performance to state-of-the-art algorithms.
[Figure: mean absolute percent error (0.00-0.30) for Shift, ARIMA, ESN, LSTM1000, LSTM3000, LSTM6000, and HTM.]
[Figure: passenger count (0k-25k) over the week of 2015-04-20 (Monday) through 2015-04-26 (Sunday).]
Data source: http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml
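The comparison metric in the figure is mean absolute percent error (MAPE). A minimal implementation, with made-up passenger counts rather than the poster's data:

```python
def mape(actual, predicted):
    """Mean absolute percent error over paired observations (actual != 0)."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

actual = [10000, 12000, 8000, 15000]     # hypothetical passenger counts
predicted = [9000, 12600, 8400, 13500]   # hypothetical model output
print(round(mape(actual, predicted), 3))  # 0.075
```

A MAPE of 0.075 means the predictions are off by 7.5% of the true count on average, on the same scale as the 0.00-0.30 axis of the bar chart.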
HTM exhibits high fault tolerance to neuron death.
[Figure: accuracy after cell death vs. fraction of cell death (0.0-0.6) for HTM and LSTM.]
HTM is fault tolerant due to properties of sparse distributed representations (Hawkins & Ahmad, 2016). In contrast, LSTM and most other artificial neural networks are sensitive to the loss of neurons or synapses (Piuri, 2001).
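The sparse-representation argument can be illustrated with simple overlap arithmetic (illustrative parameters, not the poster's experiment): killing a random fraction f of all cells removes on average only a fraction f of a pattern's active bits, so a segment whose match threshold sits well below the pattern size keeps recognizing its pattern.

```python
import random

random.seed(1)
N, W, THETA = 2048, 40, 15   # cells, pattern size, match threshold (illustrative)

pattern = set(random.sample(range(N), W))
segment = set(pattern)       # a dendritic segment storing the full pattern

def recognizes_after_death(fraction):
    """Kill a random fraction of all cells; does the segment still match?"""
    dead = set(random.sample(range(N), int(fraction * N)))
    return len(segment & (pattern - dead)) >= THETA

# Killing 30% of cells leaves ~28 of the 40 active bits on average,
# comfortably above the threshold of 15.
survived = sum(recognizes_after_death(0.3) for _ in range(100))
print(survived)  # 100 trials succeed with overwhelming probability
```

Because recognition degrades gradually with the surviving overlap rather than depending on any single unit, accuracy falls off gracefully with cell death instead of collapsing.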
HTM works well on real-world problems.

HTM adapts quickly to changes.
[Figure: mean absolute percent error (0.10-0.35) for LSTM6000 and HTM from Apr 01 through May 06, with an injected change to the data: a 20% increase in weekday night traffic and a 20% decrease in weekday morning traffic.]
HTM adapts quickly to changes in statistics due to its continuous unsupervised Hebbian learning rule (Cui et al., 2016).