Machine learning techniques for quantifying neural synchrony: application to the diagnosis of Alzheimer's disease from EEG. Justin Dauwels (LIDS, MIT; LMI, Harvard Medical School; Amari Research Unit, Brain Science Institute, RIKEN). June 9, 2008.
![Page 1: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/1.jpg)
Machine learning techniques for quantifying neural synchrony: application to the diagnosis of Alzheimer's disease from EEG
Justin Dauwels
LIDS, MIT
LMI, Harvard Medical School
Amari Research Unit, Brain Science Institute, RIKEN
June 9, 2008
![Page 2: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/2.jpg)
RIKEN Brain Science Institute
• RIKEN Wako Campus (near Tokyo)
• about 400 researchers and staff (20% foreign)
• 300 research fellows and visiting scientists
• about 60 laboratories
• research covers most aspects of brain science
Collaborators: François Vialatte*, Theo Weber+, Shun-ichi Amari*, Andrzej Cichocki* (*RIKEN, +MIT)
Project: Early diagnosis of Alzheimer’s disease based on EEG
Financial Support
![Page 3: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/3.jpg)
Research Overview
Machine learning & signal processing for applications in NEUROSCIENCE = development of ALGORITHMS to analyze brain signals
• EEG (RIKEN, MIT, MGH)
- diagnosis of Alzheimer’s disease (subject of this talk)
- detection/prediction of epileptic seizures
- analysis of EEG evoked by visual/auditory stimuli
- EEG during meditation
- projects related to brain-computer interface (BMI)
• Calcium imaging (RIKEN, NAIST, MIT)
- effect of calcium on neural growth
- role of calcium propagation in glial cells and neurons
• Diffusion MRI (Brigham and Women’s Hospital, Harvard Medical School, MIT)
- estimation and clustering of tracts (future project)
![Page 4: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/4.jpg)
Overview
• Alzheimer’s Disease (AD)
• EEG of AD patients: decrease in synchrony
• Synchrony measure in time-frequency domain
- Pairs of EEG signals
- Collections of EEG signals
• Numerical Results
• Outlook
![Page 5: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/5.jpg)
Alzheimer's disease
Outside glimpse: clinical perspective
Evolution of the disease (stages)
• 2 to 5 years before
- mild cognitive impairment (often unnoticed)
- 6 to 25% progress to Alzheimer's per year
• Mild (early stage)
- becomes less energetic or spontaneous
- noticeable cognitive deficits
- still independent (able to compensate)
• Moderate (middle stage)
- mental abilities decline
- personality changes
- becomes dependent on caregivers
• Severe (late stage)
- complete deterioration of the personality
- loss of control over bodily functions
- total dependence on caregivers
One disease, many symptoms: memory, language, executive functions, apraxia, apathy, agnosia, etc.
Videos: apathy; memory (forgetting relatives); loss of self-control (video sources: Alzheimer society)
• 2% to 5% of people over 65 years old
• up to 20% of people over 80
Jeong 2004 (Nature)
EEG data
![Page 6: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/6.jpg)
Alzheimer's disease
Inside glimpse: brain atrophy
Video source: P. Thompson, J. Neuroscience, 2003
Images: Jannis Productions (R. Fredenburg; S. Jannis)
amyloid plaques and neurofibrillary tangles
Video source: Alzheimer society
![Page 8: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/8.jpg)
Alzheimer's disease
Inside glimpse: abnormal EEG
EEG system: inexpensive, mobile, useful for screening
Brain “slow-down”: more power in slow rhythms (0.5-8 Hz), less power in fast rhythms (8-30 Hz)
(Babiloni et al., 2004; Besthorn et al., 1997; Jelic et al., 1996; Jeong, 2004; Dierks et al., 1993)
Decrease of synchrony (focus of this project)
• AD vs. MCI (Hogan et al., 2003; Jiang et al., 2005)
• AD vs. Control (Herrmann and Demiralp, 2005; Yagyu et al., 1997; Stam et al., 2002; Babiloni et al., 2006)
• MCI vs. mild AD (Babiloni et al., 2006)
Images: www.cerebromente.org.br
![Page 9: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/9.jpg)
Spontaneous (scalp) EEG
[Figure: EEG x(t), amplitude vs. t (sec); Fourier power |X(f)|² vs. f (Hz); time-frequency map |X(t,f)|² (wavelet transform), f (Hz) vs. t (sec), with time-frequency patterns (“bumps”)]
![Page 10: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/10.jpg)
Fourier transform
[Figure: Fourier transform; frequency axis from low frequency to high frequency]
![Page 11: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/11.jpg)
Windowed Fourier transform
Fourier basis functions * window function = windowed basis functions
[Figure: windowed Fourier transform, f vs. t]
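A minimal Python sketch of the windowed Fourier transform on this slide, using only the standard library. The Hann window, the window length, the hop size, and the test signal are assumptions chosen for illustration; the slides do not specify them, and the time-frequency maps in the talk are computed with a wavelet transform instead.

```python
import cmath
import math

def hann(n):
    """Hann window of length n (an assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def windowed_power(x, win_len, hop, fs):
    """Return (times, freqs, power) with power[i][k] = |X(t_i, f_k)|^2."""
    w = hann(win_len)
    n_freq = win_len // 2 + 1
    freqs = [k * fs / win_len for k in range(n_freq)]
    times, power = [], []
    for start in range(0, len(x) - win_len + 1, hop):
        # Multiply the signal segment by the window function.
        seg = [x[start + i] * w[i] for i in range(win_len)]
        row = []
        for k in range(n_freq):
            # Inner product with one Fourier basis function.
            c = sum(seg[i] * cmath.exp(-2j * math.pi * k * i / win_len)
                    for i in range(win_len))
            row.append(abs(c) ** 2)
        times.append((start + win_len / 2) / fs)  # window center in seconds
        power.append(row)
    return times, freqs, power

fs = 100.0
# Made-up 3 s test signal: a 10 Hz burst during the middle second only.
x = [math.sin(2 * math.pi * 10 * n / fs) if 100 <= n < 200 else 0.0
     for n in range(300)]
times, freqs, power = windowed_power(x, win_len=50, hop=25, fs=fs)
# Location of the strongest time-frequency coefficient (the "bump").
peak_t, peak_f = max(((t, f) for t in range(len(times))
                      for f in range(len(freqs))),
                     key=lambda tf: power[tf[0]][tf[1]])
print(times[peak_t], freqs[peak_f])
```

The burst shows up as a localized patch of large coefficients around 10 Hz between 1 s and 2 s, which is the kind of pattern the talk calls a bump.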
![Page 13: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/13.jpg)
Signatures of local synchrony
[Figure: time-frequency map, f (Hz) vs. t (sec), with time-frequency patterns (“bumps”)]
EEG stems from thousands of neurons
bump if neurons are phase-locked = local synchrony
![Page 16: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/16.jpg)
Comparing EEG signal rhythms?
2 signals
PROBLEM I:
Signals of 3 seconds sampled at 100 Hz (300 samples)
Time-frequency representation of one signal = about 25 000 coefficients
![Page 17: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/17.jpg)
Comparing EEG signal rhythms? (2)
PROBLEM II:
Shifts in time-frequency!
One pixel
Numerous neighboring pixels
![Page 18: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/18.jpg)
Sparse representation: bump model
Assumptions:
1. time-frequency map is suitable representation
2. oscillatory bursts (“bumps”) convey key information
Normalization:
Time-frequency map ($10^4$ to $10^5$ coefficients) → bumps → sparse representation (about $10^2$ parameters)
[Figures: time-frequency maps, f (Hz) vs. t (sec)]
F. Vialatte et al. “A machine learning approach to the analysis of time-frequency maps and its application to neural dynamics”, Neural Networks (2007).
![Page 19: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/19.jpg)
Similarity of bump models...
How “similar” or “synchronous” are two bump models? = GLOBAL synchrony
Reminder: bumps due to LOCAL synchrony
= MULTI-SCALE approach
![Page 20: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/20.jpg)
... by matching bumps
[Figure: bump models y1 and y2; some bumps match; offset between matched bumps]
SIMILAR bump models if:
• Many matches
• Strongly overlapping matches
![Page 21: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/21.jpg)
... by matching bumps (2)
• Bumps in one model, but NOT in other → fraction of “spurious” bumps $\rho_{\text{spur}}$
• Bumps in both models, but with offset
→ Average time offset $\delta_t$ (delay)
→ Timing jitter with variance $s_t$
→ Average frequency offset $\delta_f$
→ Frequency jitter with variance $s_f$
Synchrony: only $s_t$ and $\rho_{\text{spur}}$ relevant
PROBLEM: Given two bump models, compute $(\rho_{\text{spur}}, \delta_t, s_t, \delta_f, s_f)$
Stochastic Event Synchrony (SES) = $(\rho_{\text{spur}}, \delta_t, s_t, \delta_f, s_f)$
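As a rough illustration of what the five SES parameters measure, the sketch below computes them for a matching that is already given. The function name, the representation of a bump as a (t, f) center, and the example data are assumptions for illustration; the definition of the spurious fraction as unmatched bumps divided by all bumps is also an assumption. Finding the matching itself is the hard part and is not shown here.

```python
from statistics import mean, pvariance

def ses_from_matching(bumps_a, bumps_b, pairs):
    """SES parameters for a known matching.

    bumps_a, bumps_b: lists of (t, f) bump centers (hypothetical format).
    pairs: list of (k, k2) index pairs of matched bumps; must be non-empty.
    """
    dt = [bumps_b[j][0] - bumps_a[i][0] for i, j in pairs]
    df = [bumps_b[j][1] - bumps_a[i][1] for i, j in pairs]
    n_total = len(bumps_a) + len(bumps_b)
    n_spur = n_total - 2 * len(pairs)      # bumps without a partner
    return {
        "rho_spur": n_spur / n_total,      # fraction of spurious bumps
        "delta_t": mean(dt),               # average time offset (delay)
        "s_t": pvariance(dt),              # timing jitter (variance)
        "delta_f": mean(df),               # average frequency offset
        "s_f": pvariance(df),              # frequency jitter (variance)
    }

# Made-up example: two models with three bumps each, two of which match.
a = [(0.5, 10), (1.2, 8), (2.0, 12)]
b = [(0.6, 10), (1.2, 9), (2.7, 20)]
result = ses_from_matching(a, b, pairs=[(0, 0), (1, 1)])
print(result)
```

In this example one bump in each model has no partner, so a third of all bumps count as spurious.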
![Page 23: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/23.jpg)
Average synchrony
1. Group electrodes in regions
2. Bump model for each region
3. SES for each pair of models
4. Average the SES parameters
![Page 24: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/24.jpg)
Beyond pairwise interactions...
Pairwise similarity vs. multi-variate similarity
![Page 25: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/25.jpg)
...by clustering
[Figure: bump models y1, y2, y3, y4, y5]
Constraint: in each cluster at most one bump from each signal
Models similar if
• few deletions/large clusters
• little jitter
HARD combinatorial problem!
![Page 27: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/27.jpg)
EEG Data
EEG data provided by Prof. T. Musha
• EEG of 22 Mild Cognitive Impairment (MCI) patients and 38 age-matched control subjects (CTR), recorded at rest with eyes closed → spontaneous EEG
• All 22 MCI patients later developed Alzheimer’s disease (AD)
• Electrodes located on 21 sites according to the international 10-20 system
• Electrodes grouped into 5 zones (reduces number of pairs); 1 bump model per zone
• Used continuous “artifact-free” intervals of 20 s
• Band-pass filtered between 4 and 30 Hz
![Page 28: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/28.jpg)
Similarity measures
• Correlation and coherence
• Granger causality (linear system): DTF, ffDTF, dDTF, PDC, PC, ...
• Phase synchrony: compare instantaneous phases (wavelet/Hilbert transform)
• State space based measures: sync likelihood, S-estimator, S-H-N-indices, ...
• Information-theoretic measures: KL divergence, Jensen-Shannon divergence, ...
[Figure: no phase locking vs. phase locking; time and frequency]
![Page 29: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/29.jpg)
![Page 30: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/30.jpg)
Sensitivity (average synchrony)
[Table of measures grouped by family: Corr/Coh, Granger, Phase, State Space, Info. Theor., SES]
Mann-Whitney test: a small p-value suggests a large difference between the statistics of the two groups
Significant differences for ffDTF and ρ!
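For readers unfamiliar with the Mann-Whitney test, here is a small self-contained Python sketch. It uses the normal approximation without tie or continuity correction, which is an assumption made to keep the example short; it is not necessarily how the p-values in the talk were computed, and the two groups of values below are made up.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney test, normal approximation, no tie correction.

    Returns (U statistic of x, approximate p-value).
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at i and give it the mean rank.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))   # two-sided tail probability
    return u, p

# Made-up synchrony values for two groups (not data from the talk).
group_a = [0.1, 0.2, 0.3, 0.4, 0.5]
group_b = [0.6, 0.7, 0.8, 0.9, 1.0]
u_sep, p_sep = mann_whitney_u(group_a, group_b)
u_mix, p_mix = mann_whitney_u([1, 3, 5, 7], [2, 4, 6, 8])
print(u_sep, p_sep, u_mix, p_mix)
```

Fully separated groups give U = 0 and a small p-value, while interleaved groups give a large one.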
![Page 31: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/31.jpg)
Classification
[Figure: axis label ffDTF]
• Clear separation, but not yet useful as diagnostic tool
• Additional indicators needed (fMRI, MEG, DTI, ...)
• Can be used for screening population (inexpensive, simple, fast)
![Page 32: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/32.jpg)
Correlations
Strong (anti-)correlations: “families” of sync measures
![Page 34: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/34.jpg)
Ongoing work
Time-varying similarity parameters
[Figure: $s_t$ over time, with intervals of low $s_t$ and high $s_t$, with and without stimulus]
![Page 35: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/35.jpg)
Future work
Matching event patterns instead of single events = allows us to extract patterns in time-frequency map of EEG!
HYPOTHESIS: Perhaps specific patterns occur in time-frequency EEG maps of AD patients before onset of epileptic seizures
REMARK: Such patterns are ignored by classical approaches: STATIONARITY/AVERAGING!
[Figure: time-frequency map, f (Hz) vs. t (sec), showing coupling between frequency bands]
![Page 36: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/36.jpg)
Conclusions
Measure for similarity of point processes (“stochastic event synchrony”)
Key idea: alignment of events
Solved by statistical inference
Application: EEG synchrony of MCI patients
About 85% correctly classified; perhaps useful for screening population
Ongoing/future work: time-varying SES, extracting patterns of bumps
![Page 37: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/37.jpg)
References + software
References
Quantifying Statistical Interdependence by Message Passing on Graphs: Algorithms and Application to Neural Signals, Neural Computation (under revision)
A Comparative Study of Synchrony Measures for the Early Diagnosis of Alzheimer's Disease Based on EEG, NeuroImage (under revision)
Measuring Neural Synchrony by Message Passing, NIPS 2007
Quantifying the Similarity of Multiple Multi-Dimensional Point Processes by Integer Programming with Application to Early Diagnosis of Alzheimer's Disease from EEG, EMBC 2008 (submitted)
Software
MATLAB implementation of the synchrony measures
![Page 39: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/39.jpg)
Machine learning for neuroscience
Multi-scale in time and space
Data fusion: EEG, fMRI, spike data, bio-imaging, ...
Large-scale inference
Visualization
Behavior ↔ Brain ↔ Brain Regions ↔ Neural Assemblies ↔ Single neurons ↔ Synapses ↔ Ion channels
![Page 40: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/40.jpg)
Estimation
Deltas ($\delta_t$, $\delta_f$): average offset
Sigmas ($s_t$, $s_f$): variance of offset
...where
Simple closed form expressions
artificial observations (conjugate prior)
![Page 41: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/41.jpg)
Large-scale synchrony
Apparently, all brain regions affected...
![Page 42: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/42.jpg)
Alzheimer's disease
Outside glimpse: the future (prevalence)
[Figure: USA (Hebert et al. 2003), millions of sufferers, 1980 to 2050]
[Figure: World (Wimo et al. 2003), millions of sufferers in developed and developing countries, 2000, 2030, 2050]
• 2% to 5% of people over 65 years old
• Up to 20% of people over 80
Jeong 2004 (Nature)
![Page 43: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/43.jpg)
Ongoing and future work
Algorithms
• alternative inference techniques (e.g., MCMC, linear programming)
• time dependent (Gaussian processes)
• multivariate (T. Weber)
Applications
• Fluctuations of EEG synchrony
- Caused by auditory stimuli and music (T. Rutkowski)
- Caused by visual stimuli (F. Vialatte)
- Yoga professionals (F. Vialatte)
- Professional shogi players (RIKEN & Fujitsu)
- Brain-Computer Interfaces (T. Rutkowski)
• Spike data from interacting monkeys (N. Fujii)
• Calcium propagation in glial cells (N. Nakata)
• Neural growth (Y. Tsukada & Y. Sakumura)
• ...
![Page 44: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/44.jpg)
Fitting bump models
[Figure: signal and bump at three stages: initialisation, adaptation (gradient method), after adaptation]
F. Vialatte et al. “A machine learning approach to the analysis of time-frequency maps and its application to neural dynamics”, Neural Networks (2007).
![Page 45: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/45.jpg)
Boxplots
SURPRISE! No increase in jitter, but significantly less matched activity!
Physiological interpretation
• neural assemblies more localized?
• harder to establish large-scale synchrony?
![Page 46: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/46.jpg)
Similarity of bump models...
How “similar” or “synchronous” are two bump models?
![Page 47: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/47.jpg)
Probabilistic inference
POINT ESTIMATION: $\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$
• Uniform prior $p(\theta)$: $\delta_t, \delta_f$ = average offset; $s_t, s_f$ = variance of offset
• Conjugate prior $p(\theta)$: still closed-form expression
• Other kind of prior $p(\theta)$: numerical optimization (gradient method)
![Page 48: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/48.jpg)
Probabilistic inference
MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
EQUIVALENT to (imperfect) bipartite max-weight matching problem: find heaviest set of disjoint edges (not necessarily perfect)
$c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)}) = \operatorname{argmax}_{c} \sum_{kk'} w_{kk'}^{(i)} c_{kk'}$
s.t. $\sum_{k'} c_{kk'} \le 1$ and $\sum_{k} c_{kk'} \le 1$ and $c_{kk'} \in \{0,1\}$
ALGORITHMS
• Polynomial-time algorithms give optimal solution(s) (Edmonds-Karp and auction algorithm)
• Linear programming relaxation: extreme points of LP polytope are integral
• Max-product algorithm gives optimal solution if unique [Bayati et al. (2005), Sanghavi (2007)]
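To make the matching step concrete, the following Python sketch solves a tiny imperfect max-weight bipartite matching problem by exhaustive search. This is deliberately not one of the algorithms named on the slide (Edmonds-Karp, auction, LP relaxation, max-product); it is a brute-force stand-in that is only practical for a handful of bumps, and the weight matrix is made up.

```python
def max_weight_matching(w):
    """Exhaustive max-weight bipartite matching (small inputs only).

    w[k][k2] is the weight of pairing row k with column k2. Rows and
    columns may stay unmatched, so the matching need not be perfect.
    Returns (total weight, list of (k, k2) pairs).
    """
    n_rows = len(w)
    n_cols = len(w[0]) if w else 0

    def search(k, used):
        if k == n_rows:
            return 0.0, []
        # Option 1: leave row k unmatched.
        best, best_pairs = search(k + 1, used)
        # Option 2: match row k to any column that is still free.
        for k2 in range(n_cols):
            if k2 not in used:
                sub, sub_pairs = search(k + 1, used | {k2})
                if w[k][k2] + sub > best:
                    best = w[k][k2] + sub
                    best_pairs = [(k, k2)] + sub_pairs
        return best, best_pairs

    return search(0, frozenset())

# Made-up weights: 2 bumps in one model, 3 in the other.
w = [[3.0, 1.0, -2.0],
     [2.5, 2.0, -1.0]]
total, pairs = max_weight_matching(w)
print(total, pairs)

# If every weight is negative, the best matching is empty.
empty_total, empty_pairs = max_weight_matching([[-1.0, -2.0]])
```

Each row and each column appears in at most one pair, which is the constraint $\sum_{k'} c_{kk'} \le 1$, $\sum_{k} c_{kk'} \le 1$ from the slide; pairs with negative weight are never selected because leaving a bump unmatched costs nothing.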
![Page 49: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/49.jpg)
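Generative model
$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{kk'} \left( \mathcal{N}(t'_{k'} - t_k;\, \delta_t, s_{t,kk'})\, \mathcal{N}(f'_{k'} - f_k;\, \delta_f, s_{f,kk'})\, \beta^{-2} \right)^{c_{kk'}}$
Max-product algorithm
MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$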
![Page 50: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/50.jpg)
Max-product algorithm
MATCHING: $c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$
Conditioning on $\theta$
[Figure: graph with upward messages μ↑ and downward messages μ↓]
![Page 51: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/51.jpg)
Max-product algorithm (2)
• Iteratively compute messages
• At convergence, compute marginals $p(c_{kk'}) = \mu{\downarrow}(c_{kk'})\, \mu{\downarrow}(c_{kk'})\, \mu{\uparrow}(c_{kk'})$
• Decisions: $c^*_{kk'} = \operatorname{argmax}_{c_{kk'}} p(c_{kk'})$
![Page 52: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/52.jpg)
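Algorithm
PROBLEM: Given two bump models, compute $(\rho_{\text{spur}}, \delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
$c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$ (MATCHING → max-product)
$\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$ (ESTIMATION → closed-form)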
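The alternation between matching and estimation can be sketched in a few lines of Python. This is an illustration under several assumptions, not the implementation behind the talk: the matching step uses exhaustive search instead of max-product, the prior on $\theta$ is taken to be uniform so that the update is the sample mean and variance of the matched offsets, variances are clipped at a small floor (a hypothetical safeguard against zero variance), widths of bumps are ignored, and the bump data, `beta`, and the initial `theta` are made up.

```python
import math

def log_normal(x, mean, var):
    """Log-density of a scalar Gaussian N(x; mean, var)."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)

def weights(a, b, theta, beta):
    """w[k][k2] = log N(time offset) + log N(freq offset) - 2 log(beta)."""
    dt, st, df, sf = theta
    return [[log_normal(tb - ta, dt, st) + log_normal(fb - fa, df, sf)
             - 2 * math.log(beta)
             for (tb, fb) in b] for (ta, fa) in a]

def best_matching(w, k=0, used=frozenset()):
    """Exhaustive max-weight (possibly imperfect) matching; tiny inputs only."""
    if k == len(w):
        return 0.0, []
    best = best_matching(w, k + 1, used)           # leave row k unmatched
    for k2 in range(len(w[k])):
        if k2 not in used:
            sub_w, sub_p = best_matching(w, k + 1, used | {k2})
            if w[k][k2] + sub_w > best[0]:
                best = (w[k][k2] + sub_w, [(k, k2)] + sub_p)
    return best

def estimate(a, b, pairs, floor=1e-3):
    """Closed-form update for a uniform prior: mean and variance of offsets."""
    dts = [b[j][0] - a[i][0] for i, j in pairs]
    dfs = [b[j][1] - a[i][1] for i, j in pairs]
    dt, df = sum(dts) / len(dts), sum(dfs) / len(dfs)
    st = max(sum((d - dt) ** 2 for d in dts) / len(dts), floor)
    sf = max(sum((d - df) ** 2 for d in dfs) / len(dfs), floor)
    return dt, st, df, sf

def ses(a, b, beta=0.05, theta=(0.0, 0.1, 0.0, 4.0), max_iter=20):
    """Alternate matching and estimation until the matching stops changing."""
    pairs = []
    for _ in range(max_iter):
        _, new_pairs = best_matching(weights(a, b, theta, beta))
        if not new_pairs or new_pairs == pairs:
            break
        pairs = new_pairs
        theta = estimate(a, b, pairs)
    rho_spur = 1 - 2 * len(pairs) / (len(a) + len(b))
    return pairs, theta, rho_spur

# Made-up bump centers (t in seconds, f in Hz).
y1 = [(0.50, 10.0), (1.20, 8.0), (2.00, 12.0)]
y2 = [(0.62, 10.5), (1.30, 9.0), (2.90, 25.0)]
pairs, (delta_t, s_t, delta_f, s_f), rho_spur = ses(y1, y2)
print(pairs, delta_t, s_t, delta_f, s_f, rho_spur)
```

On this toy input the first two bumps of each model are matched, the third bump of each model is left as spurious, and the loop stops after the second matching step because the matching no longer changes.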
![Page 53: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/53.jpg)
Generative model
Generate bump model (hidden) $y_{\text{hidden}}$
• geometric prior for number $n$ of bumps: $p(n) = (1 - \lambda S)(\lambda S)^{n}$
• bumps are uniformly distributed in rectangle
• amplitude, width (in t and f) all i.i.d.
Generate two “noisy” observations $y$ and $y'$
• offset between hidden and observed bump = Gaussian random vector with mean $(\pm\delta_t/2, \pm\delta_f/2)$ and covariance $\mathrm{diag}(s_t/2, s_f/2)$: mean $(-\delta_t/2, -\delta_f/2)$ for $y$ and $(\delta_t/2, \delta_f/2)$ for $y'$
• amplitude, width (in t and f) all i.i.d.
• “deletion” with probability $p_d$
Easily extendable to more than 2 observations…
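The generative model can be simulated directly, which is a handy way to produce test data with known parameters. The Python sketch below is an assumption-laden illustration: the parameter values, the function name, and the size of the rectangle are made up, amplitudes and widths of bumps are omitted, and the geometric prior is read as $p(n) = (1-\lambda S)(\lambda S)^n$ with the product $\lambda S$ passed in as one number.

```python
import random

def sample_pair(rng, lam_s=0.9, t_max=3.0, f_max=30.0,
                delta_t=0.1, delta_f=0.0, s_t=0.01, s_f=0.25, p_d=0.2):
    """Draw a hidden bump model and two noisy observations of it.

    Returns (hidden, y, y2), each a list of (t, f) bump centers.
    """
    # Geometric number of hidden bumps: P(n) = (1 - lam_s) * lam_s ** n.
    n = 0
    while rng.random() < lam_s:
        n += 1
    # Hidden bumps are uniform in the rectangle [0, t_max] x [0, f_max].
    hidden = [(rng.uniform(0, t_max), rng.uniform(0, f_max))
              for _ in range(n)]

    def observe(sign):
        observed = []
        for t, f in hidden:
            if rng.random() < p_d:
                continue  # this hidden bump is deleted in this observation
            # Gaussian offset with mean sign * delta / 2 and variance s / 2.
            observed.append(
                (rng.gauss(t + sign * delta_t / 2, (s_t / 2) ** 0.5),
                 rng.gauss(f + sign * delta_f / 2, (s_f / 2) ** 0.5)))
        return observed

    return hidden, observe(-1), observe(+1)

rng = random.Random(0)
hidden, y, y2 = sample_pair(rng)
print(len(hidden), len(y), len(y2))
```

Because each observation can only delete hidden bumps, never add new ones, both observed models have at most as many bumps as the hidden model.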
![Page 54: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/54.jpg)
Generative model (2)
• Binary variables $c_{kk'}$: $c_{kk'} = 1$ if $k$ and $k'$ are observations of same hidden bump, else $c_{kk'} = 0$ (e.g., $c_{ii'} = 1$, $c_{ij'} = 0$)
• Constraints: $b_k = \sum_{k'} c_{kk'}$ and $b_{k'} = \sum_{k} c_{kk'}$ are binary (“matching constraints”)
• Generative model $p(y, y', y_{\text{hidden}}, c, \delta_t, \delta_f, s_t, s_f)$ (symmetric in $y$ and $y'$)
• Eliminate $y_{\text{hidden}}$: $p(y, y', c, \theta) = \int p(y, y', y_{\text{hidden}}, c, \theta)\, dy_{\text{hidden}}$ → offset is Gaussian RV with mean $(\delta_t, \delta_f)$ and covariance $\mathrm{diag}(s_t, s_f)$
• Probabilistic inference: $(c^*, \theta^*) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
[Figure: bump models $y$ and $y'$ with offsets $(-\delta_t/2, -\delta_f/2)$ and $(\delta_t/2, \delta_f/2)$; bumps $i$, $i'$, $j'$]
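The step from the per-observation offsets to the offset between matched bumps can be spelled out as follows. This is a short worked derivation, assuming the two observation noises $\varepsilon$ and $\varepsilon'$ are independent; the symbols $\varepsilon$, $\varepsilon'$, $t_{\text{hidden}}$, and $f_{\text{hidden}}$ are introduced here for illustration only.

```latex
\begin{aligned}
(t_k, f_k) &= (t_{\text{hidden}}, f_{\text{hidden}}) + \varepsilon,
  &\varepsilon &\sim \mathcal{N}\!\big((-\delta_t/2,\, -\delta_f/2),\ \mathrm{diag}(s_t/2,\, s_f/2)\big),\\
(t'_{k'}, f'_{k'}) &= (t_{\text{hidden}}, f_{\text{hidden}}) + \varepsilon',
  &\varepsilon' &\sim \mathcal{N}\!\big((\delta_t/2,\, \delta_f/2),\ \mathrm{diag}(s_t/2,\, s_f/2)\big),\\
(t'_{k'} - t_k,\ f'_{k'} - f_k) &= \varepsilon' - \varepsilon
  &&\sim \mathcal{N}\!\big((\delta_t,\, \delta_f),\ \mathrm{diag}(s_t,\, s_f)\big).
\end{aligned}
```

The hidden bump position cancels in the difference, the means subtract to $(\delta_t, \delta_f)$, and the variances of independent Gaussians add, which gives $s_t/2 + s_t/2 = s_t$ and likewise $s_f$.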
![Page 55: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/55.jpg)
Summary
• Bumps in one model, but NOT in other → fraction of “spurious” bumps $\rho_{\text{spur}}$
• Bumps in both models, but with offset
→ Average time offset $\delta_t$ (delay)
→ Timing jitter with variance $s_t$
→ Average frequency offset $\delta_f$
→ Frequency jitter with variance $s_f$
PROBLEM: Given two bump models, compute $(\rho_{\text{spur}}, \delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
![Page 56: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/56.jpg)
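Objective function
• Logarithm of model: $\log p(y, y', c, \theta) = \sum_{kk'} w_{kk'} c_{kk'} + \log I(c) + \log p_\theta(\theta) + \gamma$
$w_{kk'} = -\left( \frac{1}{s_t}(t'_{k'} - t_k - \delta_t)^2 + \frac{1}{s_f}(f'_{k'} - f_k - \delta_f)^2 \right) - 2 \log \beta$, where the term in parentheses is the Euclidean distance between bump centers
$\beta = p_d\, (\lambda/V)^{1/2}$
• Large $w_{kk'}$ if: a) bumps are close, b) small $p_d$, c) few bumps per volume element
• No need to specify $p_d$, $\lambda$, and $V$: they only appear through $\beta$ = knob to control number of matches
[Figure: bump models $y$ and $y'$ with offsets $(-\delta_t/2, -\delta_f/2)$ and $(\delta_t/2, \delta_f/2)$; bumps $i$, $i'$, $j'$]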
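The role of $\beta$ as a knob can be seen with a two-line calculation. The Python sketch below evaluates the weight formula from this slide for one made-up pair of bumps and two values of $\beta$; the function name and all numbers are assumptions for illustration.

```python
import math

def weight(bump, bump2, delta_t, s_t, delta_f, s_f, beta):
    """Weight w_kk' as written on the slide (scaled squared distance)."""
    (t, f), (t2, f2) = bump, bump2
    dist = (t2 - t - delta_t) ** 2 / s_t + (f2 - f - delta_f) ** 2 / s_f
    return -dist - 2 * math.log(beta)

# Made-up pair of bumps: same frequency, 0.2 s apart.
pair = ((0.0, 10.0), (0.2, 10.0))
params = dict(delta_t=0.0, s_t=0.01, delta_f=0.0, s_f=1.0)

w_large_beta = weight(*pair, beta=0.5, **params)    # about -4 + 1.39
w_small_beta = weight(*pair, beta=0.05, **params)   # about -4 + 5.99
print(w_large_beta, w_small_beta)
```

A pair is only worth matching when its weight is positive, that is, when its scaled distance is below $-2\log\beta$. Lowering $\beta$ raises that threshold, so more pairs become candidates for a match.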
![Page 57: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/57.jpg)
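Distance measures
$w_{kk'} = -\left( \frac{1}{s_{t,kk'}}(t'_{k'} - t_k - \delta_t)^2 + \frac{1}{s_{f,kk'}}(f'_{k'} - f_k - \delta_f)^2 \right) - 2 \log \beta$
Scaling: $s_{t,kk'} = (\Delta t_k + \Delta t'_{k'})\, s_t$, $s_{f,kk'} = (\Delta f_k + \Delta f'_{k'})\, s_f$
Non-Euclidean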
![Page 58: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/58.jpg)
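Generative model
$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{kk'} \left( \mathcal{N}(t'_{k'} - t_k;\, \delta_t, s_{t,kk'})\, \mathcal{N}(f'_{k'} - f_k;\, \delta_f, s_{f,kk'})\, \beta^{-2} \right)^{c_{kk'}}$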
![Page 59: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/59.jpg)
Prior for parameters
• Expect bumps to appear at about same frequency, but delayed
• Frequency shift requires non-linear transformation, less likely than delay
• Conjugate priors for $s_t$ and $s_f$ (scaled inverse chi-squared)
• Improper prior for $\delta_t$ and $\delta_f$: $p(\delta_t) = 1 = p(\delta_f)$
![Page 60: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/60.jpg)
Preliminary results for multi-variate model
[Figure: CTR vs. MCI; linear combination of $p_c$]
![Page 61: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/61.jpg)
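Probabilistic inference
PROBLEM: Given two bump models, compute $(\rho_{\text{spur}}, \delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
$c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$ (MATCHING)
$\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$ (POINT ESTIMATION)
[Figure: sets $X$ and $Y$ with $\min_{x \in X,\, y \in Y} d(x, y)$]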
![Page 62: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/62.jpg)
Generative model
Generate bump model (hidden) $y_{\text{hidden}}$
• geometric prior for number $n$ of bumps: $p(n) = (1 - \lambda S)(\lambda S)^{n}$
• bumps are uniformly distributed in rectangle
• amplitude, width (in t and f) all i.i.d.
Generate $M$ “noisy” observations $y_1, y_2, y_3, y_4, y_5, \ldots$
• offset between hidden and observed bump = Gaussian random vector with mean $(\delta_{t,m}/2, \delta_{f,m}/2)$ and covariance $\mathrm{diag}(s_{t,m}/2, s_{f,m}/2)$
• amplitude, width (in t and f) all i.i.d.
• “deletion” with probability $p_d$ (other prior $p_{c0}$ for cluster size)
Parameters: $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$
$p_c(i) = p(\text{cluster size} = i \mid y)$ $(i = 1, 2, \ldots, M)$
![Page 63: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/63.jpg)
Role of local synchrony
(Hebb 1949, Fuster 1997)
[Figure: assembly activation (stimuli: voice, face), Hebbian consolidation, assembly recall (stimulus: voice)]
![Page 64: Justin Dauwels LIDS, MIT LMI, Harvard Medical School](https://reader035.vdocuments.site/reader035/viewer/2022062501/568163a5550346895dd4ab2d/html5/thumbnails/64.jpg)
Probabilistic inference
PROBLEM: Given $M$ bump models, compute $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$
APPROACH: $(c^*, \theta^*) = \operatorname{argmax}_{c,\theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
$c^{(i+1)} = \operatorname{argmax}_{c} \log p(y, y', c, \theta^{(i)})$ (CLUSTERING, IP or MP)
$\theta^{(i+1)} = \operatorname{argmax}_{\theta} \log p(y, y', c^{(i+1)}, \theta)$ (POINT ESTIMATION)
Integer program
• Max-product algorithm (MP) on sparse graph
• Integer programming methods (e.g., LP relaxation)