Functional Brain Signal Processing: EEG & fMRI
Lesson 2
Kaushik Majumdar
Indian Statistical Institute Bangalore Center
M.Tech. (CS), Semester III, Course B50
EEG Processing
Preprocessing
Pattern recognition
EEG Artifacts
Benbadis and Rielo, 2008: http://emedicine.medscape.com/article/1140247-overview
Eye Blink Artifact: Electrooculogram (EOG)
Benbadis and Rielo, 2008: http://emedicine.medscape.com/article/1140247-overview
Matrix Representation of Multi-Channel EEG
M is an m × n matrix whose m rows represent the m EEG channels and whose n columns represent the n time points.
Often during EEG processing we need to find a matrix W such that WM is the processed signal.
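As a minimal illustration (the channel count and data below are hypothetical; an average-reference matrix is just one common choice of W), multiplying on the left by a spatial filter W recombines channels while leaving the time base intact:

import numpy as np

# Hypothetical example: m = 4 channels, n = 1000 time points of EEG.
m, n = 4, 1000
M = np.random.randn(m, n)            # stands in for a recorded EEG matrix

# One common choice of spatial filter W: the average-reference matrix,
# which subtracts the mean of all channels from each channel.
W = np.eye(m) - np.ones((m, m)) / m
processed = W @ M                    # WM: still m channels x n time points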
EOG Identification by Principal Component Analysis (PCA)
Majumdar, under preparation, 2013
PCA Algorithm
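The algorithm steps on the original slides were figures; the following is a minimal numpy sketch of standard covariance-eigendecomposition PCA, using the M and W notation of the matrix-representation slide (the function name and the parameter k are mine):

import numpy as np

def pca_components(M, k):
    # Remove each channel's mean (rows of M are channels).
    Mc = M - M.mean(axis=1, keepdims=True)
    C = np.cov(Mc)                        # m x m channel covariance matrix
    evals, evecs = np.linalg.eigh(C)      # eigenvalues in ascending order
    order = np.argsort(evals)[::-1]       # reorder by descending variance
    W = evecs[:, order[:k]].T             # k x m projection matrix
    return W @ Mc                         # k x n principal-component signals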
PCA
Geometrically, PCA is a rotation of the data onto the principal axes, followed by a stretching or contracting along those axes.
Performance of PCA in EOG Removal
Wallstrom et al., Int. J. Psychophysiol., 53: 105-119, 2004
Independent Component Analysis (ICA)
In PCA data components are assumed to be mutually orthogonal, which is too restrictive.
[Figure: original data sets and their PCA components.]
ICA (cont.)
PCA will give poor results if the covariance matrix has eigenvalues close to each other, since the corresponding principal directions are then poorly determined.
ICA as Blind Source Separation (BSS)
Four musicians (sources S1, S2, S3, S4) are playing in a room. From the outside, only the music can be heard, through four microphones (sensors 1, 2, 3, 4); no one can be seen. How can the music heard from outside be decomposed into the four sources?
[Figure: four sources S1–S4 inside a room, with four microphones 1–4 outside.]
Mathematical Formulation
$x = As + n$, where A is the mixing matrix, x is the sensor vector, s is the source vector, and n is noise, which is to be eliminated by filtering.
Mathematical Formulation (cont.)
Given $x$, find $W$ such that $\hat{s} = Wx$.
Any technique for estimating $W$ (and hence $s$) is called an ICA technique, or a BSS technique in general.
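A toy sketch of this formulation (the sources, mixing matrix, and sample count are made up; the noise term is omitted, and in practice A is unknown, so W cannot simply be taken as its inverse):

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
s = rng.uniform(-1, 1, size=(2, n))   # two made-up non-Gaussian sources
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])            # mixing matrix (unknown in practice)
x = A @ s                             # sensor signals (noise term omitted)

# BSS/ICA seeks W with W @ x ~= s without knowing A; with A known,
# the ideal unmixing matrix is simply its inverse:
W = np.linalg.inv(A)
s_hat = W @ x                         # recovers the sources exactly here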
ICA Algorithm: FastICA
Whitening:
Normalize (make the mean zero).
Make the variance one, i.e., $E\{xx^T\} = I$, where E denotes expectation, x is the (zero-mean) vector of signals, and I is the identity matrix.
Hyvarinen and Oja, Neural Networks, 13: 411-430, 2000
FastICA (cont.)
If the covariance matrix has the eigendecomposition $E\{xx^T\} = BDB^T$, where B is an orthogonal matrix of eigenvectors and D is the diagonal matrix of its eigenvalues, then $\tilde{x} = BD^{-1/2}B^T x$ will satisfy $E\{\tilde{x}\tilde{x}^T\} = I$.
Whitening complete.
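A minimal numpy sketch of this whitening step (the function name is mine; it assumes a full-rank covariance matrix, so all eigenvalues are strictly positive):

import numpy as np

def whiten(X):
    # Zero-mean each channel (row) of X.
    Xc = X - X.mean(axis=1, keepdims=True)
    C = np.cov(Xc)                          # covariance matrix E{x x^T}
    d, B = np.linalg.eigh(C)                # C = B diag(d) B^T, B orthogonal
    V = B @ np.diag(d ** -0.5) @ B.T        # whitening matrix B D^{-1/2} B^T
    Xw = V @ Xc                             # now E{x~ x~^T} ~= I
    return Xw, V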
Non-Gaussianity
ICA is appropriate only when probability distribution of the data set is non-Gaussian.
A Gaussian distribution is of the form $p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.
Entropy of Gaussian Variable
A Gaussian variable has the largest entropy among a class of random variables with equal variance (for a proof see Cover & Thomas, Elements of Information Theory). Here we will give an intuitive argument.
Entropy of a Random Variable X
$H(X) = -\int p(X) \log p(X)\, dX$
[Figure: two panels. Left: the deterministic signal $X = \sin(10t)$ over $0 \le t \le 7$ carries less (zero) information. Right: a random signal $X = \mathrm{random}(t)$ carries more information.]
Gaussian Random Variable Has Highest Entropy: Intuitive Proof
By the Central Limit Theorem (CLT), the mean of a class of random variables (the class being signified by a uniform variance) tends to a normal distribution as the number of members in the class tends to infinity (i.e., becomes very large).
Infinitely many observations hold an infinite, and therefore maximal, amount of information.
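A quick simulation of the CLT step (the sample sizes and the use of excess kurtosis as a Gaussianity check are my choices):

import numpy as np

rng = np.random.default_rng(0)
# Means of k uniform variables: as k grows, the distribution of the
# mean approaches a Gaussian (excess kurtosis of a Gaussian is 0).
for k in (1, 2, 30):
    means = rng.uniform(0, 1, size=(100_000, k)).mean(axis=1)
    z = (means - means.mean()) / means.std()
    print(k, (z ** 4).mean() - 3)     # shrinks toward 0 as k increases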
Intuitive Proof (cont.)
Therefore a random variable with a normal distribution has the highest information content,
so it has the highest entropy.
If each variable in a class of random variables admits only a finite number of nonzero values, the one with the uniform distribution has the highest entropy.
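The maximum-entropy claim can be checked numerically; below is a crude histogram-based estimate of differential entropy comparing a Gaussian and a uniform variable of equal (unit) variance (the estimator and bin count are my choices):

import numpy as np

def entropy_hist(samples, bins=100):
    # Crude differential-entropy estimate: -sum p log(p) * bin_width.
    p, edges = np.histogram(samples, bins=bins, density=True)
    widths = np.diff(edges)
    mask = p > 0
    return -np.sum(p[mask] * np.log(p[mask]) * widths[mask])

rng = np.random.default_rng(0)
n = 1_000_000
gauss = rng.normal(0.0, 1.0, n)                    # variance 1
unif = rng.uniform(-np.sqrt(3), np.sqrt(3), n)     # also variance 1
print(entropy_hist(gauss), entropy_hist(unif))     # Gaussian comes out larger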
Non-Gaussianity as Negentropy
$J(y) = H(y_{\mathrm{gauss}}) - H(y)$, where H is entropy, J is negentropy, and $y_{\mathrm{gauss}}$ is a Gaussian variable with the same variance as y. J is to be maximized. When J is maximal, y is reduced to a single independent component. This can be shown by computing the kurtosis of a component and of a sum of components including that component (see Hyvarinen & Oja, 2000, p. 7).
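Since entropy itself is hard to estimate from data, Hyvarinen & Oja (2000) use moment-based approximations of negentropy. The classical one, $J(y) \approx \frac{1}{12}E\{y^3\}^2 + \frac{1}{48}\,\mathrm{kurt}(y)^2$ for standardized y, is sketched below (the function name is mine):

import numpy as np

def negentropy_approx(y):
    # Moment-based approximation for standardized y (Hyvarinen & Oja, 2000):
    # J(y) ~ (1/12) E{y^3}^2 + (1/48) kurt(y)^2.
    y = (y - y.mean()) / y.std()
    skew_term = (y ** 3).mean() ** 2 / 12.0
    kurt = (y ** 4).mean() - 3.0          # excess kurtosis
    return skew_term + kurt ** 2 / 48.0

rng = np.random.default_rng(0)
print(negentropy_approx(rng.normal(size=100_000)))    # ~0 for a Gaussian
print(negentropy_approx(rng.uniform(size=100_000)))   # > 0: non-Gaussian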
Steps of FastICA after Whitening
Choose an initial random weight vector $w$. Iterate $w^+ = E\{x\, g(w^T x)\} - E\{g'(w^T x)\}\, w$, then renormalize $w = w^+/\|w^+\|$, until convergence. The nonlinearity g is in the form of either of the two: $g_1(u) = \tanh(a_1 u)$, $1 \le a_1 \le 2$, or $g_2(u) = u\, \exp(-u^2/2)$.
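A minimal one-unit FastICA iteration in numpy, following the fixed-point update above with $g = \tanh$ (the function name and convergence test are mine; X is assumed already whitened):

import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-6):
    # X: whitened signals, shape (channels, samples).
    g = np.tanh                                  # nonlinearity g1 with a1 = 1
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2    # its derivative
    w = np.random.randn(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        u = w @ X                                # projections w^T x
        w_new = (X * g(u)).mean(axis=1) - g_prime(u).mean() * w
        w_new /= np.linalg.norm(w_new)
        if abs(abs(w_new @ w) - 1.0) < tol:      # converged (up to sign)
            return w_new
        w = w_new
    return w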
Exercise
ICA is implemented in EEGLAB (the runica function, which implements Infomax ICA; FastICA is available as a plugin). Remove artifacts from sample EEG data using the ICA implementation in EEGLAB.
Concept of Independence in PCA and ICA
In PCA independence means orthogonality i.e., pairwise dot product is zero.
In ICA independence means statistical independence. Let x, y be random variables, let p(x) be the probability density function of x, and let p(x,y) be the joint probability density function of (x,y). If p(x,y) = p(x)·p(y) holds, we call x and y statistically independent.
Independence (cont.)
If nonzero vectors v1 and v2 are orthogonal, they are linearly independent. Suppose not; then a1v1 + a2v2 = 0 with a1, a2 not both zero. Taking the dot product with v1 gives a1(v1·v1) + a2(v2·v1) = a1(v1·v1) = 0, so a1 = 0; similarly a2 = 0, a contradiction.
If v1 = c·v2 then both of them must have the same probability distribution, i.e., p(v1,v2) = p(v1) = p(v2). If v1 and v2 are linearly independent, p(v1,v2) = p(v1)·p(v2) may or may not hold.
If p(v1,v2) = p(v1)·p(v2) holds, then v1 and v2 are linearly independent.
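The distinction can be seen numerically: below, y is a deterministic function of x, so the pair is statistically dependent, yet their correlation (the PCA notion of independence) is essentially zero (the data and bin counts are made up for illustration):

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100_000)
y = x ** 2                        # y is fully determined by x: dependent

# Uncorrelated ("orthogonal" in the PCA sense): E{xy} ~ 0 ...
print(np.mean(x * y))
# ... yet p(x, y) != p(x) p(y): compare the joint histogram with the
# product of the marginal histograms.
H2, xe, ye = np.histogram2d(x, y, bins=20, density=True)
px, _ = np.histogram(x, bins=xe, density=True)
py, _ = np.histogram(y, bins=ye, density=True)
print(np.abs(H2 - np.outer(px, py)).max())   # far from zero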
Conditions for ICA Applicability
Sources are statistically independent.
Propagation delays in the mixing medium are negligible (delays may affect sources at different locations differently, thereby corrupting their temporal structures).
Sources are time varying.
Number of sources = number of sensors.
References
Benbadis and Rielo, EEG artifacts, eMedicine, 2008. Available online at http://emedicine.medscape.com/article/1140247-overview.
Hyvarinen and Oja, Independent component analysis: algorithms and applications, Neural Networks, vol. 13, pp. 411-430, 2000.
Majumdar, A Brief Survey of Quantitative EEG Analysis, Chapter 2.
Wallstrom et al., Int. J. Psychophysiol., vol. 53, pp. 105-119, 2004.
THANK YOU
This lecture is available at http://www.isibang.ac.in/~kaushik