Classifying Event-Related Desynchronization in EEG, ECoG, and
MEG Signals
Kim Sang-Hyuk
Bioimaging
Contents
• Introduction
• Experimental setup and procedure
• Preanalysis
• Data processing
• Generalization error estimation
Introduction
• Several different technologies exist for measuring brain activity
– They have their own advantages and limitations
– Spatial and temporal resolution
– Cost, portability and risk to the user
• Comparative studies are required in order to guide
• Motor-imagery BCI experiments based on electroencephalography (EEG), electrocorticography (ECoG) and magnetoencephalography (MEG)
• A simple binary synchronous (trial-based) paradigm
• Present quantitative results focusing on
– The effect of the number of trials
– The effect of spatial filtering
Introduction
• EEG
– Electrical signals are measured by passive electrodes
– Very high temporal resolution
– Low cost and low risk; portable
– Limited spatial resolution
• ECoG
– Electrical signals are obtained from an array of electrodes beneath the skull
– High SNR
– A better response at higher frequencies
– Invasive
• MEG
– Measures the tiny magnetic field fluctuations induced by the electrical activity of cerebral neurons
– Expensive and nonportable
Experimental Setup and Procedure
• EEG
– 8 untrained right-handed male subjects
– 39 silver chloride electrodes
– Sampling frequency: 256 Hz
– The subjects were seated in an armchair at a distance of 1 m in front of a computer screen
[Figure: Positions of electrodes. The legend distinguishes the electrodes used for data acquisition from the reference electrode.]
Experimental Setup and Procedure
• Each trial started with a blank screen
• A small fixation cross was displayed in the center of the screen from second 2 to second 9
• At 2 s, a short warning tone (beep) sounded
• At 3 s, the fixation cross was overlaid with an arrow at the center of the monitor for 1.5 s
– The arrow pointed either to the left or to the right
• In order to avoid event-related signals in later processing stages, only data from seconds 4 to 9 of each trial were considered
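The restriction to seconds 4 to 9 can be sketched in a few lines. This is a minimal illustration, assuming each trial is stored as a NumPy array of shape (channels, samples) that starts at t = 0 s; the function name `extract_window` and the random test data are hypothetical, while the 256 Hz sampling rate and the 39 channels come from the setup slide.

```python
import numpy as np

FS = 256                    # sampling frequency in Hz (from the setup slide)
T_START, T_END = 4.0, 9.0   # analysis window in seconds (from the slide)

def extract_window(trial, fs=FS, t_start=T_START, t_end=T_END):
    """Return the samples of `trial` recorded between t_start and t_end.

    trial: array of shape (n_channels, n_samples), first sample at t = 0 s
    (this storage layout is an assumption, not stated in the slides).
    """
    start = int(round(t_start * fs))
    stop = int(round(t_end * fs))
    if trial.shape[1] < stop:
        raise ValueError("trial is shorter than the requested window")
    return trial[:, start:stop]

# Hypothetical 9-second trial with 39 channels of random data.
rng = np.random.default_rng(0)
trial = rng.standard_normal((39, 9 * FS))
window = extract_window(trial)
print(window.shape)  # (39, 1280): 5 s at 256 Hz per channel
```

The resulting 1280 samples per channel match the time-series length quoted later for the AR model.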
Preanalysis
• Goal: identify and exclude subjects that did not show significant μ-activity at all
• Restricted to only the 17 EEG channels that were located over or close to the motor cortex
– Calculation of a μ-band feature using the Welch method (short time Fourier transform) for each subject
• This feature extraction resulted in one parameter per trial and channel
• The eight data sets consisting of the Welch features were classified with linear support vector machines, including individual model selection for each subject
• Generalization errors were estimated by 10-fold cross-validation (CV)
• For three subjects the preanalysis showed very poor error rates close to chance level; their data sets were excluded from further analysis
Preanalysis
Short Time Fourier Transform (STFT)
• A Fourier-related transformation used to examine the frequency and phase content of local sections of a signal over time
• Discrete-time STFT: $X(m, \omega) = \sum_{n=-\infty}^{\infty} x[n]\, w[n-m]\, e^{-j\omega n}$
– $w[n]$ is the window function
– The window slides along the time axis
[Figure: Examples of window overlap]
Preanalysis
Short Time Fourier Transform (STFT)
[Figure: Examples of STFT]
Preanalysis
Short Time Fourier Transform (STFT)
• 5 segments per trial, overlapping by 50%
• The spectra of the 5 segments are averaged
• The result is a vector of log amplitudes at different frequencies for each sensor
[Figure: A trial is split into 5 overlapping segments; averaging their spectra yields a single vector]
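The segment-and-average procedure can be illustrated as follows. This is only a sketch under stated assumptions: the Hann window, the averaging of squared magnitudes (as in Welch's method) and the function name `welch_log_amplitudes` are choices made here for illustration, not details given in the slides. The test signal is a synthetic 10 Hz sine in noise, not recorded data.

```python
import numpy as np

def welch_log_amplitudes(x, fs, n_segments=5, overlap=0.5):
    """Average the spectra of overlapping segments and return log amplitudes.

    Returns (freqs, log_amp) for a single channel x sampled at fs Hz.
    """
    x = np.asarray(x, dtype=float)
    # Segment length L chosen so that n_segments segments with the given
    # overlap fit into x: len(x) >= L * (1 + (n_segments - 1) * (1 - overlap))
    seg_len = int(len(x) / (1 + (n_segments - 1) * (1 - overlap)))
    step = int(seg_len * (1 - overlap))
    window = np.hanning(seg_len)      # window type is an assumption
    power = np.zeros(seg_len // 2 + 1)
    for k in range(n_segments):
        seg = x[k * step : k * step + seg_len]
        power += np.abs(np.fft.rfft(seg * window)) ** 2
    power /= n_segments               # averaged periodogram
    freqs = np.fft.rfftfreq(seg_len, d=1.0 / fs)
    log_amp = 0.5 * np.log(power + 1e-12)   # log of amplitude = 0.5 * log of power
    return freqs, log_amp

# Synthetic example: 5 s of a 10 Hz sine plus noise at 256 Hz.
fs = 256
t = np.arange(1280) / fs
rng = np.random.default_rng(0)
x = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.standard_normal(t.size)
freqs, log_amp = welch_log_amplitudes(x, fs)
peak = freqs[np.argmax(log_amp)]
print(f"spectral peak near {peak:.1f} Hz")
```

A single μ-band parameter per trial and channel could then be read off from the entries of `log_amp` that fall inside the band of interest.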
Data Preprocessing
Autoregressive (AR) Model
• The AR(p) model is defined as $x_t = \sum_{i=1}^{p} a_i x_{t-i} + \varepsilon_t$
– where $a_1, \dots, a_p$ are the parameters of the model and $\varepsilon_t$ is white noise
– $p$ is the order
• The output is modeled as a linear combination of the $p$ past values of the output
• For the remaining five subjects, the recorded 5 s window of each trial resulted in a time series of 1280 sample points per channel
• An AR model of order 3 was fitted to the time series of all 39 channels using forward-backward linear prediction
– The three resulting coefficients per channel and trial formed the new representation of the data
– The extraction of the features did not explicitly incorporate prior knowledge
– The features are not directly linked to the μ-rhythm
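One way to fit AR coefficients by forward-backward linear prediction is a least-squares problem that stacks the forward and the backward prediction equations. The sketch below assumes this least-squares formulation and the sign convention $x_t \approx \sum_i a_i x_{t-i}$; the function names and the flattening of the per-channel coefficients into one feature vector are illustrative assumptions, not the authors' code.

```python
import numpy as np

def fit_ar_forward_backward(x, order=3):
    """Least-squares AR fit using forward and backward prediction equations."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Forward: x[t] ~ sum_i a_i * x[t - i] for t = order .. n-1
    fwd = np.column_stack([x[order - i : n - i] for i in range(1, order + 1)])
    fwd_target = x[order:]
    # Backward: x[t] ~ sum_i a_i * x[t + i] for t = 0 .. n-order-1
    bwd = np.column_stack([x[i : n - order + i] for i in range(1, order + 1)])
    bwd_target = x[: n - order]
    A = np.vstack([fwd, bwd])
    b = np.concatenate([fwd_target, bwd_target])
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    return coeffs

def trial_features(trial, order=3):
    """Concatenate the AR coefficients of all channels of one trial."""
    return np.concatenate([fit_ar_forward_backward(ch, order) for ch in trial])

# Synthetic check: simulate an AR(3) process and recover its coefficients.
rng = np.random.default_rng(0)
true_a = np.array([0.5, -0.3, 0.1])
x = np.zeros(1280)
noise = rng.standard_normal(1280)
for t in range(3, 1280):
    x[t] = true_a @ x[t - 3 : t][::-1] + noise[t]
print(fit_ar_forward_backward(x))   # estimates should lie close to true_a
```

With 39 channels and order 3, `trial_features` returns 117 values per trial.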
Support Vector Machine
Linear Support Vector Machine
• Choose a decision boundary between the classes such that the margin is maximized
– Margin: the distance in feature space between the boundary and the nearest data points (support vectors)
[Figure: Linearly separable case]
Support Vector Machine
Linear Support Vector Machine
• The hyperplane is defined by $g(x) = w^T x + w_0 = 0$
– $w$: weight vector normal to the hyperplane
– $w_0$: threshold
• The distance of a point $x$ from the hyperplane: $d = \frac{|g(x)|}{\|w\|}$
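As a quick numerical illustration of the two formulas, the following sketch evaluates $g(x)$ and the distance $d$ for one point. The weight vector, threshold and point are arbitrary values chosen here so that the arithmetic is easy to follow.

```python
import numpy as np

def g(x, w, w0):
    """Decision function g(x) = w^T x + w0."""
    return float(np.dot(w, x) + w0)

def distance_to_hyperplane(x, w, w0):
    """Distance d = |g(x)| / ||w|| of point x from the hyperplane g(x) = 0."""
    return abs(g(x, w, w0)) / float(np.linalg.norm(w))

# Arbitrary example values (not from the slides).
w = np.array([3.0, 4.0])
w0 = -5.0
x = np.array([3.0, 4.0])

print(g(x, w, w0))                       # 20.0  (= 9 + 16 - 5)
print(distance_to_hyperplane(x, w, w0))  # 4.0   (= 20 / 5)
```

The sign of `g` tells on which side of the hyperplane the point lies, and dividing by the norm of `w` converts the value into a geometric distance.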
Support Vector Machine
Linear Support Vector Machine
• Scale $w$, $w_0$ so that the value of $g(x)$ at the support vectors is equal to $+1$ for $S_1$ and equal to $-1$ for $S_2$
– Margin: $\frac{1}{\|w\|} + \frac{1}{\|w\|} = \frac{2}{\|w\|}$
– $w^T x + w_0 \ge 1, \ \forall x \in S_1$
– $w^T x + w_0 \le -1, \ \forall x \in S_2$
• Compute the parameters $w$, $w_0$ of the hyperplane so as to:
– Minimize $J(w) = \frac{1}{2}\|w\|^2$
– Subject to $y_i (w^T x_i + w_0) \ge 1, \ i = 1, 2, \dots, N$, where $y_i$ is the corresponding class indicator ($+1$ for $S_1$, $-1$ for $S_2$)
Support Vector Machine
Linear Support Vector Machine
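• The Karush-Kuhn-Tucker (KKT) conditions
– $\frac{\partial}{\partial w} L(w, w_0, \lambda) = 0$
– $\frac{\partial}{\partial w_0} L(w, w_0, \lambda) = 0$
– $\lambda_i \ge 0, \ i = 1, 2, \dots, N$
– $\lambda_i [y_i (w^T x_i + w_0) - 1] = 0, \ i = 1, 2, \dots, N$
– $\lambda$ is the vector of the Lagrange multipliers
– $L(w, w_0, \lambda)$ is the Lagrangian function, defined as $L(w, w_0, \lambda) = \frac{1}{2} w^T w - \sum_{i=1}^{N} \lambda_i [y_i (w^T x_i + w_0) - 1]$
• The final results are
– $w = \sum_{i=1}^{N} \lambda_i y_i x_i$
– $\sum_{i=1}^{N} \lambda_i y_i = 0$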
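A two-point toy problem makes these results concrete. In the sketch below, the data points and the multipliers $\lambda_1 = \lambda_2 = 1/4$ are a hand-derived example chosen for illustration (by symmetry both points are support vectors); the code only checks that the stated relations hold for them.

```python
import numpy as np

# Toy data: one point per class (hypothetical example, not from the slides).
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
y = np.array([1.0, -1.0])

# Lagrange multipliers derived by hand for this toy problem.
lam = np.array([0.25, 0.25])

# w = sum_i lambda_i * y_i * x_i
w = (lam * y) @ X
# At a support vector y_i (w^T x_i + w0) = 1, hence w0 = y_i - w^T x_i.
w0 = y[0] - w @ X[0]

functional_margins = y * (X @ w + w0)   # should equal 1 at support vectors
margin_width = 2.0 / np.linalg.norm(w)  # 2 / ||w||

print(w, w0)                 # [0.5 0.5] 0.0
print(np.sum(lam * y))       # 0.0, the second result holds
print(functional_margins)    # [1. 1.]
print(margin_width)          # equals the distance between the two points
```

Here the margin width $2/\|w\|$ coincides with the distance between the two points, as expected when the hyperplane lies midway between them.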
Support Vector Machine
Soft Margin Support Vector Machine
• When the classes are not separable, the soft margin support vector machine can be used
• The training feature vectors fall into three categories
– Vectors that fall outside the band and are correctly classified
– Vectors that fall inside the band and are correctly classified: $0 \le y_i (w^T x + w_0) < 1$
– Vectors that are misclassified: $y_i (w^T x + w_0) < 0$
Support Vector Machine
Soft Margin Support Vector Machine
• All three cases can be treated under a single type of constraint: $y_i (w^T x_i + w_0) \ge 1 - \xi_i$
– The first category of data: $\xi_i = 0$
– The second: $0 < \xi_i \le 1$
– The third: $\xi_i > 1$
• The goal is to make the margin as large as possible while keeping the number of points with $\xi_i > 0$ as small as possible
• Cost function: $J(w, w_0, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} I(\xi_i)$
– where $\xi$ is the vector of the parameters $\xi_i$, and $I(\xi_i) = 1$ if $\xi_i > 0$, $I(\xi_i) = 0$ if $\xi_i = 0$
Support Vector Machine
Soft Margin Support Vector Machine
• The parameter $C$ is a positive constant that controls the relative influence of the two competing terms
• Optimizing this cost function is difficult because it involves a discontinuous function
– A closely related cost function is used instead
– Minimize $J(w, w_0, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{N} \xi_i$
– Subject to $y_i [w^T x_i + w_0] \ge 1 - \xi_i$ and $\xi_i \ge 0$, $i = 1, 2, \dots, N$
• The optimal margin depends on $C$: as $C$ decreases, the margin widens and more points become support vectors
– Finding a good value for $C$ is part of the model selection procedure
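The relaxed cost function can be minimized in several ways. The sketch below uses plain subgradient descent on the equivalent unconstrained form $\frac{1}{2}\|w\|^2 + C \sum_i \max(0, 1 - y_i(w^T x_i + w_0))$, which is a swapped-in technique chosen for brevity and not necessarily the solver used in the study. The function names, learning rate, epoch count and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def train_soft_margin_svm(X, y, C=1.0, lr=0.001, epochs=2000):
    """Subgradient descent on 0.5*||w||^2 + C * sum_i max(0, 1 - y_i g(x_i))."""
    n_samples, n_features = X.shape
    w = np.zeros(n_features)
    w0 = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + w0)
        violating = margins < 1            # points with slack xi_i > 0
        grad_w = w - C * (y[violating, None] * X[violating]).sum(axis=0)
        grad_w0 = -C * y[violating].sum()
        w -= lr * grad_w
        w0 -= lr * grad_w0
    return w, w0

def predict(X, w, w0):
    """Class labels (+1 or -1) from the sign of g(x)."""
    return np.where(X @ w + w0 >= 0, 1.0, -1.0)

# Synthetic two-class data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
X_pos = rng.normal(loc=2.0, scale=0.5, size=(50, 2))
X_neg = rng.normal(loc=-2.0, scale=0.5, size=(50, 2))
X = np.vstack([X_pos, X_neg])
y = np.concatenate([np.ones(50), -np.ones(50)])

w, w0 = train_soft_margin_svm(X, y, C=1.0)
accuracy = np.mean(predict(X, w, w0) == y)
print(f"training accuracy: {accuracy:.2f}")
```

Rerunning the training with several values of `C` and comparing validation errors is one simple way to carry out the model selection mentioned above.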
Generalization Error Estimation
K-Fold Cross Validation
• A statistical method for validating a predictive model
• The whole data set is separated into k subsets (folds) of equal size
• In each round, the k subsets are split into a training set and a test set
– k-1 subsets are used for training the classifier
– 1 subset is used for validation
• Model training and evaluation are repeated k times, with each of the k subsets used once for validation
[Figure: An example of 5-fold cross-validation]
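The k-fold procedure can be sketched generically as follows. The helper names are hypothetical, and a nearest-class-mean classifier stands in for the SVM purely to keep the example self-contained; any `fit`/`predict` pair with the same interface could be plugged in. The data are synthetic.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k folds of (nearly) equal size."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_val_error(X, y, fit, predict, k=10, seed=0):
    """Mean misclassification rate over k train/validation rounds."""
    folds = k_fold_indices(len(y), k, seed)
    errors = []
    for i in range(k):
        test_idx = folds[i]                                   # 1 fold for validation
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train_idx], y[train_idx])               # k-1 folds for training
        errors.append(np.mean(predict(model, X[test_idx]) != y[test_idx]))
    return float(np.mean(errors))

# Stand-in classifier (assumption): assign each point to the nearest class mean.
def fit_nearest_mean(X, y):
    labels = np.unique(y)
    means = np.stack([X[y == label].mean(axis=0) for label in labels])
    return labels, means

def predict_nearest_mean(model, X):
    labels, means = model
    dists = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]

# Synthetic two-class data (hypothetical).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(2.0, 0.5, (50, 2)), rng.normal(-2.0, 0.5, (50, 2))])
y = np.concatenate([np.ones(50), -np.ones(50)])

cv_error = cross_val_error(X, y, fit_nearest_mean, predict_nearest_mean, k=10)
print(f"10-fold CV error estimate: {cv_error:.3f}")
```

Because every sample is used exactly once for validation, the averaged error serves as an estimate of the generalization error.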
Contents of Next Lecture
• Feature selection methods
– Fisher criterion
– Zero-norm optimization
– Recursive feature elimination (RFE)
• Results in EEG
• Procedure and results in ECoG
• Procedure and results in MEG
• Overview of results in EEG, ECoG and MEG