Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Upload: oswin-jennings

Post on 22-Dec-2015


Page 1: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Optimization and Data Mining in Epilepsy Research

W. Art Chaovalitwongse

Assistant Professor

Industrial and Systems Engineering

Rutgers University

Page 2

Acknowledgements

Comprehensive Epilepsy Center, St. Peter’s University Hospital
Rajesh C. Sachdeo, MD
Deepak Tikku, MD

Brain Institute, University of Florida
Panos M. Pardalos, PhD
J. Chris Sackellares, MD
Paul R. Carney, MD

Bioengineering, Arizona State University
Leonidas D. Iasemidis, PhD

Page 3

Agenda

Background: Epilepsy
Electroencephalogram (EEG) Time Series
Chaos Theory: Dimensionality Reduction
Seizure Prediction
  Feature Selection
  Process Monitoring
Concluding Remarks

Page 4

Facts About Epilepsy

At least 2 million Americans and another 40-50 million people worldwide (about 1% of the population) suffer from epilepsy.

Epilepsy is the second most common brain disorder (after stroke)

The hallmark of epilepsy is recurrent seizures.

Epileptic seizures occur when a massive group of neurons in the cerebral cortex suddenly begin to discharge in a highly organized rhythmic pattern.

Page 5

Epileptic Seizures

Seizures usually occur spontaneously, in the absence of external triggers.

Seizures cause temporary disturbances of brain functions such as motor control, responsiveness and recall which typically last from seconds to a few minutes.

Seizures may be followed by a post-ictal period of confusion or impaired sensorium that can persist for several hours.

Page 6

Rationale

Based on 1995 estimates, epilepsy imposes an annual economic burden of $12.5 billion in the U.S. in associated health care costs and losses in employment, wages, and productivity.

Cost per patient ranged from $4,272 for persons with remission after initial diagnosis and treatment to $138,602 for persons with intractable and frequent seizures.

Page 7

How To Fight Epilepsy

Anti-Epileptic Drugs (AEDs)
  Mainstay of epilepsy treatment
  Approximately 25 to 30% of patients remain unresponsive

Epilepsy surgery
  Requires long-term invasive EEG monitoring
  50% of pre-surgical candidates do not undergo resective surgery
    Multiple epileptogenic zones
    Epileptogenic zone located in functional brain tissue
  Only 60% of surgery cases result in seizure freedom

Electrical Stimulation (Vagus nerve stimulator)
  Parameters (amplitude and duration of stimulation) arbitrarily adjusted
  As effective as one additional AED dose
  Side effects

Seizure Prediction?

Page 8

Vagus Nerve Stimulator

Page 9

Open Problems

Is the seizure occurrence random?
If not, can seizures be predicted?
If yes, are there seizure pre-cursors preceding seizures?
If yes, what measurement can be used to indicate these pre-cursors?
Does normal brain activity differ from abnormal brain activity?

Page 10

Electroencephalogram (EEG)

…is a tool for evaluating the physiological state of the brain.

…offers excellent spatial and temporal resolution to characterize rapidly changing electrical activity of brain activation.

…captures voltage potentials produced by brain cells while communicating.

In an EEG, electrodes are implanted deep in the brain or placed on the scalp over multiple areas of the brain to detect and record patterns of electrical activity and check for abnormalities.

Page 11

From Microscopic to Macroscopic Level (Electroencephalogram - EEG)

Page 12

Depth and Subdural electrode placement for EEG recordings

[Figure: electrode placement diagram; electrode labels LOF, ROF, LTD, RTD, LST, RST]

Page 13

Scalp EEG Data Acquisition

Page 14

EEG Data Acquisition

Page 15

Typical EEG Time Series Data

Page 16

Goals of Research

Test the hypothesis that seizures are not a random process.

Employ data mining techniques to differentiate normal and abnormal EEGs

Employ quantitative analysis to identify seizure pre-cursors

Demonstrate that seizures could be predicted

Develop a closed-loop seizure control device (Brain Pacemaker)

Page 17

10-second EEGs: Seizure Evolution

[Figure panels: Normal, Pre-Seizure, Seizure, Post-Seizure]

Page 18

Dimensionality Reduction

The brain is a non-stationary system.

EEG time series is non-stationary.

With 200 Hz sampling, 1 hour of EEGs comprises 200*60*60*30 = 21,600,000 data points = 43.2 MB (assuming a 16-bit ASCII format).

1 day = 1 hour*24
1 week = 1 hour*168
20 patients = 1 hour*3360

Kilobytes → Megabytes → Gigabytes → Terabytes
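The storage arithmetic on this slide can be reproduced in a few lines. This is only a sketch under stated assumptions: the factor of 30 is read here as the number of recorded channels (the slide does not say), each sample is taken to occupy 2 bytes (16 bits), and the function name `eeg_data_volume` is illustrative.

```python
def eeg_data_volume(hours, sampling_hz=200, channels=30, bytes_per_sample=2):
    """Return (number of samples, size in megabytes) for a recording.

    channels=30 and bytes_per_sample=2 are assumptions matching the
    slide's 200*60*60*30 product and its "16-bit" remark.
    """
    samples = sampling_hz * 60 * 60 * channels * hours
    megabytes = samples * bytes_per_sample / 1_000_000
    return samples, megabytes


# Recording lengths listed on the slide, expressed in hours.
for label, hours in [("1 hour", 1), ("1 day", 24), ("1 week", 168), ("20 patients", 3360)]:
    samples, megabytes = eeg_data_volume(hours)
    print(f"{label}: {samples:,} samples, {megabytes:,.1f} MB")
```

Under these assumptions the 3360-hour case comes to roughly 145 GB, which is the scale that motivates reducing each 10.24-second epoch to a single dynamical measure.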

Page 19

Dimensionality Reduction Using Chaos Theory

Chaos in Brain? Chaos in Stock Market? Chaos in Foreign Exchanges (Swedish Currency)?

Measure the brain dynamics from EEG time series.

Apply dynamical measures (based on chaos theory) to non-overlapping EEG epochs of 10.24 seconds = 2048 points.

Maximum Short-Term Lyapunov Exponent
  measures the average uncertainty along the local eigenvectors and phase differences of an attractor in the phase space
  measures the chaoticity of the brain waves

Page 20

1. Embed the data set (EEG): $X_i = (x(t_i), x(t_i+\tau), \ldots, x(t_i+(p-1)\tau))^T$, where $\tau$ is the selected time lag between the components of each vector in the phase space, $p$ is the selected dimension of the embedding phase space, and $t_i \in [1, T-(p-1)\tau]$.

2. Pick a point $x(t_0)$ somewhere in the middle of the trajectory. Find that point's nearest neighbor. Call that point $z_0(t_0)$.

3. Compute $|z_0(t_0) - x(t_0)| = L_0$.

4. Follow the "difference trajectory" (the dashed line) forwards in time, computing $|z_0(t_i) - x(t_i)| = L_0(i)$ and incrementing $i$, until $L_0(i) > \varepsilon$. Call that value $L_0'$ and that time $t_1$.

5. Find $z_1(t_1)$, the "nearest neighbor" of $x(t_1)$, and go to step 3. Repeat the procedure to the end of the fiduciary trajectory $t = t_n$, keeping track of the $L_i$ and $L_i'$.

The exponent is then estimated as

\[ \lambda_1 = \frac{1}{N \Delta t} \sum_{i=0}^{M-1} \log_2 \frac{L_i'}{L_i}, \]

where $M$ is the number of times we went through the loop above, and $N$ is the number of time-steps in the fiduciary trajectory, with $N \Delta t = t_n - t_0$.
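Two pieces of this procedure are easy to sketch in code: the delay embedding and the final averaging of the recorded separations. The sketch below is not the STLmax implementation used in the study. It assumes the usual form of the estimate, the sum of log2(L_i'/L_i) divided by the elapsed time, and it leaves out the nearest-neighbor search and replacement loop. The names `delay_embed` and `exponent_from_lengths` are hypothetical.

```python
import math


def delay_embed(x, p, tau):
    """Build delay vectors X_i = (x[i], x[i+tau], ..., x[i+(p-1)*tau])."""
    n_vectors = len(x) - (p - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series too short for this p and tau")
    return [tuple(x[i + k * tau] for k in range(p)) for i in range(n_vectors)]


def exponent_from_lengths(initial_lengths, final_lengths, total_time):
    """Average log2 growth of the separations, in bits per unit time.

    initial_lengths holds the L_i and final_lengths the L_i' recorded
    on each pass through the loop; total_time is t_n - t_0.
    """
    growth = sum(math.log2(lf / l0) for l0, lf in zip(initial_lengths, final_lengths))
    return growth / total_time


# Toy inputs chosen only to make the arithmetic easy to follow.
vectors = delay_embed([0, 1, 2, 3, 4, 5], p=3, tau=2)
rate = exponent_from_lengths([1.0, 1.0], [2.0, 4.0], total_time=3.0)
```

With the toy inputs, the separations double and then quadruple over three time units, so the estimate is (1 + 2) / 3 = 1 bit per unit time.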

Page 21

2-D Example: Circle of initial conditions evolves into an ellipse.

Page 22

STLmax Profiles

[Figure panels: Pre-Ictal, Ictal, Post-Ictal]

Page 23

Hidden Synchronization Patterns

Page 24

How similar are they? Statistics to quantify the convergence of STLmax

By paired-T statistic: Per electrode, for EEG signal epochs i and j, suppose their STLmax values in the epochs (of length 60 points, 10 minutes) are

\[ L_i = \{ \mathrm{STLmax}_i^1, \mathrm{STLmax}_i^2, \ldots, \mathrm{STLmax}_i^{60} \} \]
\[ L_j = \{ \mathrm{STLmax}_j^1, \mathrm{STLmax}_j^2, \ldots, \mathrm{STLmax}_j^{60} \}, \]
\[ D_{ij} = L_i - L_j = \{ d_{ij}^1, d_{ij}^2, \ldots, d_{ij}^{60} \} = \{ \mathrm{STLmax}_i^1 - \mathrm{STLmax}_j^1, \mathrm{STLmax}_i^2 - \mathrm{STLmax}_j^2, \ldots, \mathrm{STLmax}_i^{60} - \mathrm{STLmax}_j^{60} \} \]

Then, we calculate the average value, $\bar{D}_{ij}$, and the sample standard deviation, $\hat{\sigma}_d$, of $D_{ij} = \{ d_{ij}^1, d_{ij}^2, \ldots, d_{ij}^{60} \}$.

The T-index between EEG signal epochs i and j is defined as

\[ T_{ij} = \frac{|\bar{D}_{ij}|}{\hat{\sigma}_d / \sqrt{60}}. \]
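The paired-T computation on this slide can be sketched in plain Python. Two points are assumptions rather than something the slide states: the absolute value of the mean difference is used so that the index is non-negative, and a zero standard deviation is handled by returning 0 or infinity. The function name `t_index` is illustrative.

```python
import math


def t_index(stl_i, stl_j):
    """Paired-T index between two equally long STLmax windows."""
    if len(stl_i) != len(stl_j) or len(stl_i) < 2:
        raise ValueError("windows must have the same length (at least 2)")
    diffs = [a - b for a, b in zip(stl_i, stl_j)]
    n = len(diffs)
    mean = sum(diffs) / n
    # Sample standard deviation (divides by n - 1).
    std = math.sqrt(sum((d - mean) ** 2 for d in diffs) / (n - 1))
    if std == 0.0:
        # Assumed convention for constant differences.
        return 0.0 if mean == 0.0 else math.inf
    return abs(mean) / (std / math.sqrt(n))
```

For a 60-point window, a value below the critical value 2.662 quoted later in the deck would be read as the two sites having converged.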

Page 25

Statistically Quantifying the Convergence

Page 26

IID (Independent and Identically Distributed) Test

Assumption 1: Within a window of 30 STLmax points, the differences of STLmax values (Dij) between two electrode sites i and j are independent.

To verify this assumption, we employed the “portmanteau” test of white noise developed by Ljung and Box.

Assumption 2: Within a window of 60 points, the differences of STLmax values between two electrode sites i and j are normally distributed.

To verify this assumption, we employed the Shapiro-Wilk W test, which is a well-established and powerful test of departure from normality.
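As an illustration of the independence check, the Ljung-Box portmanteau statistic can be computed directly from the sample autocorrelations: Q = n(n+2) Σ ρ_k² / (n-k), summed over lags k = 1..h. The sketch below is a minimal pure-Python version; the function name `ljung_box_q` and the choice of maximum lag are assumptions, since the slides do not say which lags were tested.

```python
def ljung_box_q(series, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(series)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    centered = [v - mean for v in series]
    c0 = sum(v * v for v in centered)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation at lag k.
        rho_k = sum(centered[t] * centered[t + k] for t in range(n - k)) / c0
        total += rho_k * rho_k / (n - k)
    return n * (n + 2) * total
```

Under the null hypothesis of independence, Q is approximately chi-square distributed with `max_lag` degrees of freedom, so a large Q argues against Assumption 1. In practice a library routine such as `scipy.stats.shapiro` would be a natural choice for the normality check in Assumption 2.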

Page 27

Convergence of STLmax

Page 28

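Models

\[ \frac{dx_i(t)}{dt} = -\omega_i y_i - z_i + \sum_{j=1,\, j \neq i}^{N} \left( \varepsilon_{i,j} x_j - \varepsilon'_{i,j} x_i \right) \quad (1) \]

\[ \frac{dy_i(t)}{dt} = \omega_i x_i + \alpha_i y_i \quad (2) \]

\[ \frac{dz_i(t)}{dt} = \beta_i x_i + z_i (x_i - \gamma_i) \quad (3) \]

$\omega_i$, $\alpha_i$, $\beta_i$, and $\gamma_i$ are intrinsic parameters.

$\varepsilon_{i,j}$ and $\varepsilon'_{i,j}$ are directional coupling strengths.

N = number of oscillators

Homoclinic Chaos (Silnikov’s Theorem):

Rössler systems, Lorenz systems, population dynamical systems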
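A model of this kind can be explored numerically. The sketch below assumes the oscillators are diffusively coupled Rössler-type systems of the form dx_i/dt = -ω_i y_i - z_i + Σ_{j≠i}(ε_ij x_j - ε'_ij x_i), dy_i/dt = ω_i x_i + α_i y_i, dz_i/dt = β_i x_i + z_i(x_i - γ_i). The parameter names, the forward-Euler step, and the example values are all illustrative assumptions, not values taken from the slides.

```python
def coupled_rossler_rhs(state, omega, alpha, beta, gamma, eps, eps_prime):
    """Right-hand side for N coupled oscillators.

    state is a list of (x, y, z) tuples, one per oscillator;
    eps[i][j] and eps_prime[i][j] are the directional coupling strengths.
    """
    n = len(state)
    derivs = []
    for i, (x, y, z) in enumerate(state):
        coupling = sum(
            eps[i][j] * state[j][0] - eps_prime[i][j] * x
            for j in range(n)
            if j != i
        )
        dx = -omega[i] * y - z + coupling
        dy = omega[i] * x + alpha[i] * y
        dz = beta[i] * x + z * (x - gamma[i])
        derivs.append((dx, dy, dz))
    return derivs


def euler_step(state, dt, **params):
    """One forward-Euler step; a finer integrator would be used in practice."""
    derivs = coupled_rossler_rhs(state, **params)
    return [
        (x + dt * dx, y + dt * dy, z + dt * dz)
        for (x, y, z), (dx, dy, dz) in zip(state, derivs)
    ]
```

Raising the coupling strengths and recomputing STLmax on the x components is the kind of experiment the "STLmax versus time and coupling" slide appears to summarize.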

Page 29

STLmax versus time and coupling

Page 30

Why Feature Selection?

Not every electrode site shows the convergence.

Feature Selection: Select the electrodes that are most likely to show the convergence preceding the next seizure.

Page 31

Optimization Problem

Optimization: We apply optimization techniques to find a group of electrode sites such that …
  They are the most converged (in STLmax) electrode sites during the 10-min window before the seizure
  They show the dynamical resetting (diverged in STLmax) during the 10-min window after the seizure.

Such electrode sites are defined as “critical electrode sites”.

Hypothesis: The critical electrode sites should be most likely to show the convergence in STLmax again before the next seizure.

Page 32

Multi-Quadratic Integer Programming

To select critical electrode sites, we formulated this problem as a multi-quadratic integer (0-1) programming (MQIP) problem with …
  an objective function to minimize the average T-index among electrode sites
  a linear constraint to identify the number of critical electrode sites
  a quadratic constraint to ensure that the selected electrode sites show the dynamical resetting

Problem $P_1$:

\[ \min\ f(x) = x^T Q x \]
\[ \text{s.t.}\quad \sum_{i=1}^{n} x_i = b \]
\[ x^T D x \geq \alpha \]
\[ x_i \in \{0,1\},\quad i = 1,\ldots,n \]

Page 33

Notation and Modeling

x is an n-dimensional column vector (decision variables), where each xi represents the electrode site i. xi = 1 if electrode i is selected to be one of the critical electrode sites; xi = 0 otherwise.

Q is an (n×n) matrix, each element qij of which represents the T-index between electrodes i and j during the 10-minute window before a seizure.

b is an integer constant (the number of critical electrode sites).

D is an (n×n) matrix, each element dij of which represents the T-index between electrodes i and j during the 10-minute window after a seizure.

α = 2.662*b*(b-1), a constant. 2.662 is the critical value of the T-index, as previously defined, to reject H0: “two brain sites acquire identical STLmax values within a 10-minute window”.
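For a handful of electrodes, the selection problem can be solved by simple enumeration, which makes the roles of Q, D, b, and α concrete. The sketch below is not the solution method used in the talk (the following slides reformulate the problem for a MILP solver). It assumes the constraints are "exactly b sites selected" and "x^T D x ≥ α"; the function name and the 4-electrode matrices are invented for illustration.

```python
from itertools import combinations


def solve_mqip_brute_force(Q, D, b, alpha):
    """Return (sites, objective) minimizing x^T Q x over all b-subsets
    whose post-seizure value x^T D x is at least alpha."""
    n = len(Q)
    best_sites, best_value = None, None
    for sites in combinations(range(n), b):
        pre = sum(Q[i][j] for i in sites for j in sites)   # x^T Q x
        post = sum(D[i][j] for i in sites for j in sites)  # x^T D x
        if post < alpha:
            continue  # selected sites do not show dynamical resetting
        if best_value is None or pre < best_value:
            best_sites, best_value = sites, pre
    return best_sites, best_value


# Hypothetical symmetric T-index matrices for 4 electrodes.
Q = [[0, 1, 4, 6], [1, 0, 5, 2], [4, 5, 0, 3], [6, 2, 3, 0]]
D = [[0, 1, 3, 5], [1, 0, 4, 6], [3, 4, 0, 2], [5, 6, 2, 0]]
b = 2
alpha = 2.662 * b * (b - 1)
sites, value = solve_mqip_brute_force(Q, D, b, alpha)
```

In this toy instance the pair (0, 1) has the smallest pre-seizure T-index but fails the resetting constraint, so the pair (1, 3) is selected instead. Enumeration grows combinatorially with n, which is why a reformulation is needed for realistic electrode counts.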

Page 34

Conventional Linearization Approach for Multi-Quadratic 0-1 Problem

For each product $x_i x_j$, we introduce a new 0-1 variable $x_{ij} = x_i x_j$ ($i \neq j$).

Note that $x_{ii} = x_i^2 = x_i$ for $x_i \in \{0,1\}$.

The equivalent linear 0-1 problem is given by:

\[ \min\ \sum_i \sum_j q_{ij} x_{ij} \]
\[ \text{s.t.}\quad Ax = b \]
\[ x_{ij} \leq x_i, \quad \text{for } i, j = 1,\ldots,n\ (i \neq j) \]
\[ x_{ij} \leq x_j, \quad \text{for } i, j = 1,\ldots,n\ (i \neq j) \]
\[ x_i + x_j - 1 \leq x_{ij}, \quad \text{for } i, j = 1,\ldots,n\ (i \neq j) \]
\[ \sum_i \sum_j d_{ij} x_{ij} \geq \alpha \]
\[ x_i \in \{0,1\},\quad 0 \leq x_{ij} \leq 1,\quad i, j = 1,\ldots,n \]

Note that the number of continuous variables has been increased to $O(n^2)$.

Note that this problem formulation is computationally inefficient as n increases.
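A quick check shows why the three linking inequalities are enough: for binary x_i and x_j they squeeze the continuous variable x_ij onto the product x_i x_j. The sketch below verifies this for all four binary combinations and counts the extra variables. The helper names are illustrative, and counting one variable per unordered pair assumes x_ij = x_ji.

```python
from itertools import product


def linearized_bounds(xi, xj):
    """Feasible interval for x_ij given binary xi and xj."""
    lower = max(0, xi + xj - 1)   # from x_i + x_j - 1 <= x_ij and x_ij >= 0
    upper = min(1, xi, xj)        # from x_ij <= x_i, x_ij <= x_j, x_ij <= 1
    return lower, upper


def num_pair_variables(n):
    """Number of product variables when x_ij and x_ji are identified."""
    return n * (n - 1) // 2


for xi, xj in product((0, 1), repeat=2):
    lower, upper = linearized_bounds(xi, xj)
    # The interval collapses to the single point x_i * x_j.
    assert lower == upper == xi * xj
```

With 30 electrodes this already adds 435 continuous variables and three constraints per pair, which illustrates the quadratic growth noted on the slide.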

Page 35

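KKT Conditions Approach

Consider the quadratic 0-1 programming problem

\[ \min\ f(x) = x^T Q x \]
\[ \text{s.t.}\quad Ax = b \]
\[ x_i \in \{0,1\},\quad i = 1,\ldots,n \]

where Q is an (n×n) matrix, b is an integer constant, x is an n-dimensional column vector, and $e^T = (1,1,\ldots,1)$.

Relaxing to $x \geq 0$ gives

\[ \min\ f(x) = x^T Q x \]
\[ \text{s.t.}\quad Ax = b \]
\[ x_i \geq 0,\quad i = 1,\ldots,n \]

and we then have the following KKT conditions:

\[ Qx - u \cdot e - y = 0 \]
\[ Ax = b \]
\[ y^T x = 0 \]
\[ x \geq 0,\quad u \geq 0,\quad y \geq 0 \]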

Page 36

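KKT Conditions Approach

Add slack variables a and define s = u.e + a. Minimizing slack variables, we can formulate this problem as:

\[ \min\ e^T s \]
\[ \text{s.t.}\quad Qx - y - s = 0 \]
\[ Ax = b \]
\[ y^T x = 0 \]
\[ x \geq 0,\quad s \geq 0,\quad y \geq 0 \]

Fix $x \in \{0,1\}$: $y^T x = 0 \iff y \leq M(1 - x)$

\[ \min\ e^T s \]
\[ \text{s.t.}\quad Qx - y - s = 0 \]
\[ Ax = b \]
\[ y \leq M(1 - x) \]
\[ \text{where } s \geq 0,\ y \geq 0,\ x \in \{0,1\},\ \text{and } M = \|Q\|_\infty = \max_i \sum_j |q_{ij}| \]

Note that this problem formulation is an efficient approach, as n increases, because it has the SAME number of 0-1 variables (n), and 2n additional continuous variables.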
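The claimed equivalence can be checked numerically on a small instance. For a fixed binary x, the smallest feasible e^T s under Qx - y - s = 0, 0 ≤ y ≤ M(1 - x), s ≥ 0 is obtained by letting y_i absorb (Qx)_i wherever x_i = 0, which leaves exactly x^T Q x. The sketch below assumes non-negative entries in Q and takes the linear constraint to be "exactly b ones"; all function names are illustrative and no MILP solver is involved.

```python
from itertools import product


def quadratic_value(Q, x):
    """x^T Q x for a binary vector x."""
    n = len(x)
    return sum(x[i] * Q[i][j] * x[j] for i in range(n) for j in range(n))


def min_slack_sum(Q, x):
    """Smallest e^T s for fixed binary x in the mixed-integer form."""
    n = len(x)
    big_m = max(sum(abs(q) for q in row) for row in Q)  # M = ||Q||_inf
    total = 0.0
    for i in range(n):
        qx_i = sum(Q[i][j] * x[j] for j in range(n))
        y_i = min(qx_i, big_m * (1 - x[i]))  # y_i is forced to 0 when x_i = 1
        total += qx_i - y_i                  # s_i = (Qx)_i - y_i >= 0
    return total


def compare_optima(Q, b):
    """Optimal values of the quadratic and mixed-integer forms."""
    feasible = [x for x in product((0, 1), repeat=len(Q)) if sum(x) == b]
    qip = min(quadratic_value(Q, x) for x in feasible)
    milp = min(min_slack_sum(Q, x) for x in feasible)
    return qip, milp
```

On any non-negative Q the two optimal values returned by `compare_optima` coincide, which is the content of the equivalence discussed on the next slides.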

Page 37

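Connections Between QIP problems and MILP problems

For any matrix Q where $q_{ij} \geq 0$, we want to prove that $P$ and $\bar{P}$ are equivalent:

Problem $P$:

\[ \min\ f(x) = x^T Q x \]
\[ \text{s.t.}\quad Ax = b \]
\[ x_i \in \{0,1\},\quad i = 1,\ldots,n \]

Equivalent

Problem $\bar{P}$:

\[ \min\ e^T s \]
\[ \text{s.t.}\quad Qx - y - s = 0 \quad (1) \]
\[ Ax = b \quad (2) \]
\[ y \leq M(1 - x) \iff y^T x = 0 \quad (3) \]
\[ s \geq 0,\quad y \geq 0,\quad x \in \{0,1\} \quad (4) \]
\[ \text{where } M = \|Q\|_\infty = \max_i \sum_j |q_{ij}| \]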

Page 38

Theorem 1: "$P$ has an optimal solution $x^0$ iff there exist $y^0$, $s^0$ such that $(x^0, y^0, s^0)$ is an optimal solution to $\bar{P}$."

PROOF. Necessity: If $x^0$ is an optimal solution to $P$, it is obvious that there exist $y$, $s$ with $y \geq 0$, $s \geq 0$ such that $Qx^0 - y - s = 0$ (1) and $y^T x^0 = 0$ (3).

Choose $y^0$ and $s^0$ from the above defined set of $y$ and $s$ s.t. $e^T s$ is minimized.

Let us show that $(x^0, y^0, s^0)$ is an optimal solution to $\bar{P}$.

Multiplying (1) by $(x^0)^T$, we obtain $(x^0)^T Q x^0 - (x^0)^T y^0 - (x^0)^T s^0 = 0$.

Note that from (3), $(x^0)^T y^0 = (y^0)^T x^0 = 0$. We then have $(x^0)^T Q x^0 = (x^0)^T s^0$.

We know that $x^0 \in \arg\min x^T Q x$, s.t. $Ax = b$, $x \in \{0,1\}^n$. If we can prove that $e^T s^0 = (x^0)^T s^0$ (5), then $(x^0, y^0, s^0)$ is an optimal solution to $\bar{P}$.

Page 39

\[ e^T s^0 = s_1^0 + s_2^0 + \ldots + s_n^0, \qquad (x^0)^T s^0 = x_1^0 s_1^0 + x_2^0 s_2^0 + \ldots + x_n^0 s_n^0 \]

To prove $e^T s^0 = (x^0)^T s^0$ (5), it is sufficient to show that, for any $i$, if $x_i^0 = 0$, then $s_i^0 = 0$. We can prove this statement by contradiction.

Proof: Assume that, given $(x^0, y^0, s^0)$ that is an optimal solution to $\bar{P}$, $x_i^0 = 0$ and $s_i^0 > 0$ for some $i$ ($e^T s^0$ is minimized).

For any such $i$, define vectors $\tilde{y}$ and $\tilde{s}$ with $\tilde{y}_i = y_i^0 + s_i^0$ and $\tilde{s}_i = 0$, which shows that $s^0$ is not the optimal solution ($e^T s^0$ is not minimal). It is clear that $(x^0, \tilde{y}, \tilde{s})$ satisfies all constraints (1) - (4) in $\bar{P}$. Thus, $(x^0, \tilde{y}, \tilde{s})$ is feasible and $e^T \tilde{s} < e^T s^0$.

This fact contradicts our initial assumption that $(x^0, y^0, s^0)$ is an optimal solution to $\bar{P}$.

Sufficiency: The proof is similar.

Page 40

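Theoretical Results: MILP formulation for MQIP problem

Consider the MQIP problem $P_1$. We proved that the MQIP program is EQUIVALENT to a MILP problem with the SAME number of integer variables.

Problem $P_1$:

\[ \min\ f(x) = x^T Q x \]
\[ \text{s.t.}\quad Ax = b \]
\[ x^T D x \geq \alpha \]
\[ x_i \in \{0,1\},\quad i = 1,\ldots,n \]

Equivalent

Problem $\bar{P}_1$:

\[ \min\ e^T s \]
\[ \text{s.t.}\quad Qx - y - s = 0 \quad (1) \]
\[ Ax = b \quad (2) \]
\[ y \leq M(1 - x) \quad (3) \]
\[ Dx - z \geq 0 \quad (4) \]
\[ e^T z \geq \alpha \quad (5) \]
\[ z \leq M' x \quad (6) \]
\[ s, y, z \geq 0,\quad x \in \{0,1\} \quad (7) \]
\[ \text{where } M = \|Q\|_\infty = \max_i \sum_j |q_{ij}|,\quad M' = \|D\|_\infty = \max_i \sum_j |d_{ij}| \]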

Page 41

Theorem 2: "$P_1$ has an optimal solution $x^0$ iff there exist $y^0$, $s^0$, $z^0$ such that $(x^0, y^0, s^0, z^0)$ is an optimal solution to $\bar{P}_1$."

PROOF. Necessity: From the proof of theorem 1, to prove theorem 2 we only need to show that if $x^0$ is an optimal solution to problem $P_1$, then there exists a vector $z^0$ (s.t. $z_i^0 \geq 0$) and the following constraints are satisfied:

\[ Dx^0 - z^0 \geq 0 \quad (1) \]
\[ e^T z^0 \geq \alpha \quad (2) \]
\[ z^0 \leq M' x^0 \quad (3) \]

From (3), note that if $x_i^0 = 0$ then we have $z_i^0 = 0$ (the proof is similar to the one in theorem 1).

Then we obtain $e^T z^0 = (x^0)^T z^0$ (4).

Since $z_i^0$ is a real number and every element of the matrix $D$ is nonnegative, for all $i$ where we have $x_i^0 = 1$, we can choose $z_i^0 \geq 0$ such that $(Dx^0)_i = z_i^0$. We then satisfy (1) and (3).

Multiplying (1) by $(x^0)^T$, from (4) we obtain $(x^0)^T D x^0 = (x^0)^T z^0 = e^T z^0$.

Since $x^0$ is an optimal solution to $P_1$, (2) is satisfied: $(x^0)^T D x^0 = e^T z^0 \geq \alpha$.

Sufficiency: The proof is similar.

Reference:

• P.M. Pardalos, W. Chaovalitwongse, L.D. Iasemidis, J.C. Sackellares, D.-S. Shiau, P.R. Carney, O.A. Prokopyev, and V.A. Yatsenko. Seizure Warning Algorithm Based on Spatiotemporal Dynamics of Intracranial EEG. Mathematical Programming, 101(2): 365-385, 2004.

Page 42

Empirical Results: Performance on Larger Problems

Reference:

• W. Chaovalitwongse, P.M. Pardalos, and O.A. Prokopyev. Reduction of Multi-Quadratic 0-1 Programming Problems to Linear Mixed 0-1 Programming Problems. Operations Research Letters, 32(6): 517-522, 2004.

Page 43

Empirical Results: Performance on Larger Problems

Page 44

Hypothesis Testing - Simulation

Hypothesis: The critical electrode sites should be most likely to show the convergence in STLmax (drop in T-index below the critical value) again before the next seizure.

The critical electrode sites are electrode sites that
  are the most converged (in STLmax) electrode sites during the 10-min window before the seizure
  show the dynamical resetting (diverged in STLmax) during the 10-min window after the seizure

Simulation: Based on 3 patients with 20 seizures, we compare the probability of showing the convergence in STLmax (drop in T-index below the critical value) before the next seizure between the electrode sites, which are
  Critical electrode sites
  Randomly selected (5,000 times)

Page 45

Optimal VS Non-Optimal

Page 46

Simulation - Results

Page 47

How to automate the system

Page 48

Automated Seizure Warning System

[Flow diagram, labeled ASWA:]

EEG Signals

Continuously calculate STLmax from multi-channel EEG.

Monitor the average T-index of the critical electrodes.

Give a warning when: T-index value is greater than 5, then drops to a value of 2.662 or less.

Select critical electrode sites after every subsequent seizure.
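The warning rule in this diagram amounts to a small two-state monitor: arm once the average T-index exceeds the upper threshold, then warn when it falls to the critical value or below. The sketch below is one plausible reading of that rule. The function name and the choice to disarm after each warning are assumptions, and reselecting critical electrodes after a seizure is left out.

```python
def warning_times(t_index_series, upper=5.0, critical=2.662):
    """Return the indices at which a warning would be issued."""
    warnings = []
    armed = False
    for k, value in enumerate(t_index_series):
        if value > upper:
            armed = True            # sites are clearly diverged
        elif armed and value <= critical:
            warnings.append(k)      # diverged earlier, now converged
            armed = False           # assumed: require a new rise before the next warning
    return warnings


# Hypothetical average T-index profile of the critical electrodes.
profile = [6.0, 4.0, 2.5, 2.0, 5.5, 3.0, 2.6]
issued = warning_times(profile)
```

For the hypothetical profile, warnings are issued at indices 2 and 6; the low value at index 3 does not trigger a second warning because the monitor has not been re-armed.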

Page 49

Data Characteristics

Page 50

Performance Evaluation for ASWS

To test this algorithm, a warning was considered to be true if a seizure occurred within 3 hours after the warning.

\[ \text{Sensitivity} = \frac{\#\ \text{of accurately predicted seizures}}{\#\ \text{of analyzed seizures}} \]

False Prediction Rate = average number of false warnings per hour
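These two measures can be computed from lists of warning times and seizure times. The sketch below assumes all times are in hours from the start of the recording, counts a seizure as predicted if any warning precedes it by at most the 3-hour horizon, and counts a warning as false if no seizure follows within that horizon. The function name and the example times are illustrative.

```python
def evaluate_warnings(warning_hours, seizure_hours, total_hours, horizon=3.0):
    """Return (sensitivity, false predictions per hour)."""
    predicted = sum(
        1
        for s in seizure_hours
        if any(0 < s - w <= horizon for w in warning_hours)
    )
    false_warnings = sum(
        1
        for w in warning_hours
        if not any(0 < s - w <= horizon for s in seizure_hours)
    )
    sensitivity = predicted / len(seizure_hours)
    false_rate = false_warnings / total_hours
    return sensitivity, false_rate


# Hypothetical 40-hour recording with two seizures and three warnings.
sens, fpr = evaluate_warnings([1.0, 10.0, 20.0], [3.0, 30.0], total_hours=40.0)
```

In the hypothetical recording, only the seizure at hour 3 is preceded by a warning within the horizon, giving a sensitivity of 0.5, and the two remaining warnings give 0.05 false predictions per hour.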

Page 51

Training Results

Performance characteristics of automated seizure warning algorithm with the best parameter settings of training data set.

Page 52

RECEIVER OPERATING CHARACTERISTICS (ROC)

ROC curve (receiver operating characteristic) is used to indicate an appropriate trade-off that one can achieve between:
  the false positive rate (1-Specificity, plotted on X-axis) that needs to be minimized
  the detection rate (Sensitivity, plotted on Y-axis) that needs to be maximized.

Page 53

ROC curve analysis for the best parameter settings of 10 patients

Page 54

Test Results

Performance characteristics of automated seizure warning algorithm with the best parameter settings on testing data set.

Page 55

Validation of the ASWS algorithm

Temporal Properties
  Surrogate Seizure Time Data Set
  100 Surrogate Data Sets

Spatial Properties
  Non-Optimized ASWS: selecting non-optimal electrode sites
  100 Randomly Selected Electrodes

Page 56

Prediction Scores: ASWS

Page 57

Prediction Scores: Surrogate Data and Non-Optimized ASWS

W. Chaovalitwongse, L.D. Iasemidis, P.M. Pardalos, P.R. Carney, D.-S. Shiau, and J.C. Sackellares. A Robust Method for Studying the Dynamics of the Intracranial EEG: Application to Epilepsy. Epilepsy Research, 64, 93-133, 2005.

Page 58

Prediction Scores: Surrogate Data and Non-Optimal ASWS

Page 59

Concluding Remarks

Overview of Epilepsy Research
Applications of Data Mining and Optimization Techniques
Interplay between theory and application
The first online real-time seizure prediction system

Seizure Prediction
  Predicting ~70% of temporal lobe seizures on average
  Giving a false alarm rate of ~0.16 per hour on average

Ongoing and Future Research
  Classification of EEGs from normal and epileptic patients
  Classification of abnormal brain activity
  Cluster analysis of epileptic brains
  Analysis on scalp EEGs

Page 60

Reference

W. Chaovalitwongse, L.D. Iasemidis, P.M. Pardalos, P.R. Carney, D.-S. Shiau, and J.C. Sackellares. A Robust Method for Studying the Dynamics of the Intracranial EEG: Application to Epilepsy. Epilepsy Research, 64, 93-133, 2005.

W. Chaovalitwongse, P.M. Pardalos, and O.A. Prokopyev. EEG Classification in Epilepsy. To appear in Annals of Operations Research.

W. Chaovalitwongse and P.M. Pardalos. Optimization Approaches to Characterize the Hidden Dynamics of the Epileptic Brain: Seizure Prediction and Localization. To appear in SIAG/OPT Views-and-News.

W. Chaovalitwongse, P.M. Pardalos, L.D. Iasemidis, D.-S. Shiau, and J.C. Sackellares. Dynamical Approaches and Multi-Quadratic Integer Programming for Seizure Prediction. Optimization Methods and Software, 20(2-3): 383-394, 2005.

L.D. Iasemidis, P.M. Pardalos, D.-S. Shiau, W. Chaovalitwongse, K. Narayanan, A. Prasad, K. Tsakalis, P.R. Carney, and J.C. Sackellares. Long Term Prospective On-Line Real-Time Seizure Prediction. Journal of Clinical Neurophysiology, 116(3): 532-544, 2005.

P.M. Pardalos, W. Chaovalitwongse, L.D. Iasemidis, J.C. Sackellares, D.-S. Shiau, P.R. Carney, O.A. Prokopyev, and V.A. Yatsenko. Seizure Warning Algorithm Based on Spatiotemporal Dynamics of Intracranial EEG. Mathematical Programming, 101(2): 365-385, 2004. (INFORMS Pierskalla Best Paper Award 2004)

W. Chaovalitwongse, P.M. Pardalos, and O.A. Prokopyev. A New Linearization Technique for Multi-Quadratic 0-1 Programming Problems. Operations Research Letters, 32(6): 517-522, 2004. (Rank 5th in Top 25 Articles in Operations Research Letters)

Page 61

Questions?

Thank you

Page 62

Classification of Brain Activity

Page 63

Phase Profiles

Page 64

Entropy H of Attractor

Page 65

Classification of Physiological States

Page 66

Nearest Neighbor Time Series Classification

[Figure labels: Normal, Pre-Seizure, Post-Seizure, A]

Page 67

Similarity Measure for EEG Time Series: T-test

By paired-T statistic: Per electrode, for EEG signal epochs i and j, suppose their STLmax values in the epochs (of length 30 points, 5 minutes) are

\[ L_i = \{ \mathrm{STLmax}_i^1, \mathrm{STLmax}_i^2, \ldots, \mathrm{STLmax}_i^{30} \} \]
\[ L_j = \{ \mathrm{STLmax}_j^1, \mathrm{STLmax}_j^2, \ldots, \mathrm{STLmax}_j^{30} \}, \]
\[ D_{ij} = L_i - L_j = \{ d_{ij}^1, d_{ij}^2, \ldots, d_{ij}^{30} \} = \{ \mathrm{STLmax}_i^1 - \mathrm{STLmax}_j^1, \mathrm{STLmax}_i^2 - \mathrm{STLmax}_j^2, \ldots, \mathrm{STLmax}_i^{30} - \mathrm{STLmax}_j^{30} \} \]

Then, we calculate the average value, $\bar{D}_{ij}$, and the sample standard deviation, $\hat{\sigma}_d$, of $D_{ij} = \{ d_{ij}^1, d_{ij}^2, \ldots, d_{ij}^{30} \}$.

The T-index between EEG signal epochs i and j is defined as

\[ T_{ij} = \frac{|\bar{D}_{ij}|}{\hat{\sigma}_d / \sqrt{30}}. \]

Page 68

T-Statistics Distance

The T-index, $T_{xy}$, between the time series x and y is then defined as:

\[ T_{xy} = \frac{|E[X] - E[Y]|}{\sigma_{xy} / \sqrt{n}} \]

where E[ ] denotes the average of the value within an epoch of the time series, n is the length of the time series epoch, and $\sigma_{xy}$ is the sample standard deviation of the difference in value of x and y.

Asymptotically, the $T_{xy}$ index follows a t-distribution with n-1 degrees of freedom.

Page 69

Nearest Neighbor Classification Rules

Given an unknown-state epoch of EEG signals A, we calculate statistical distances between the EEG epoch and the groups of Normal, Pre-Seizure, and Post-Seizure EEGs in our database.

EEG sample A will be classified in the group of patient’s states (normal, pre-seizure, and post-seizure) that yields the minimum T-index distance.

Multiple Electrodes = Multiple Decisions
  Averaging
  Voting (majority voting: selects the action with the maximum number of votes)
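The classification rule and the two ways of combining electrodes can be sketched as follows. This is a hedged illustration, not the study's implementation: the names `electrode_distances`, `classify_by_averaging`, and `classify_by_voting` are hypothetical, the data layout (an epoch as a list of per-electrode series) is assumed, and the distance to a group is assumed to be the mean T-index to that group's reference epochs, which the slides do not specify.

```python
import math
from collections import Counter
from statistics import mean, stdev


def t_index(x, y):
    """Paired T-index between two equal-length series."""
    diffs = [a - b for a, b in zip(x, y)]
    return abs(mean(diffs)) / (stdev(diffs) / math.sqrt(len(diffs)))


def electrode_distances(epoch, database):
    """Return {state: [distance at electrode 0, distance at electrode 1, ...]}.

    Assumption: the distance to a group is the mean T-index to its epochs.
    """
    return {
        state: [
            mean(t_index(series, ref[e]) for ref in refs)
            for e, series in enumerate(epoch)
        ]
        for state, refs in database.items()
    }


def classify_by_averaging(epoch, database):
    """Average the distances over electrodes, then pick the closest state."""
    dists = electrode_distances(epoch, database)
    return min(dists, key=lambda state: mean(dists[state]))


def classify_by_voting(epoch, database):
    """Each electrode votes for its closest state; the majority wins.

    Ties go to the state that received its first vote earliest.
    """
    dists = electrode_distances(epoch, database)
    votes = Counter(
        min(dists, key=lambda state: dists[state][e])
        for e in range(len(epoch))
    )
    return votes.most_common(1)[0][0]


# Toy database: one reference epoch per state, two electrodes, four points.
database = {
    "normal": [[[1.0, 2.0, 1.0, 2.0], [1.0, 2.0, 1.0, 2.0]]],
    "pre-seizure": [[[5.0, 6.5, 5.0, 6.5], [5.0, 6.5, 5.0, 6.5]]],
}
epoch_a = [[1.1, 2.3, 0.9, 2.2], [1.2, 2.1, 1.0, 2.4]]
print(classify_by_averaging(epoch_a, database))
print(classify_by_voting(epoch_a, database))
```

Averaging lets one electrode with a very large distance dominate the decision, while voting gives every electrode equal weight.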

Page 70: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Preliminary Data Set

132 5-minute epochs of pre-seizure EEGs
132 5-minute epochs of post-seizure EEGs
300 5-minute epochs of normal EEGs

Pre-seizure = 0-30 minutes before seizure
Post-seizure = 2-10 minutes after seizure
Normal = 10 hours away from seizure

Page 71: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Probability of Correct Classifications

Page 72: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Probability of Correct Classifications

Patient State Classification (Voting - Lmax+Phase) - Sensitivity

[Figure: bar chart. X-axis: States; Y-axis: Percentage of Classified Type.]

Actual state    Classified Pre-ictal    Classified Post-ictal    Classified Inter-ictal
Pre-ictal       95.65%                  4.35%                    0.00%
Post-ictal      22.73%                  72.73%                   4.55%
Inter-ictal     25.00%                  10.00%                   65.00%

Page 73: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Metrics for Performance Evaluation

                        PREDICTED CLASS
                        Class=Yes    Class=No
ACTUAL    Class=Yes     a            b
CLASS     Class=No      c            d

a: TP (true positive); b: FN (false negative);
c: FP (false positive); d: TN (true negative)

Page 74: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Sensitivity and Specificity

Sensitivity measures the fraction of positive cases that are classified as positive.

Specificity measures the fraction of negative cases classified as negative.

Sensitivity = TP/(TP+FN)

Specificity = TN/(TN+FP)

Sensitivity can be considered as a detection (prediction or classification) rate that one wants to maximize.

Maximize the probability of correctly classifying patient states.

The false positive rate equals 1-Specificity, which one wants to minimize.
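As a small worked example, the definitions above can be computed directly from actual and predicted labels. The function names and the toy labels are hypothetical and serve only to illustrate the arithmetic.

```python
def confusion_counts(actual, predicted, positive):
    """Return (TP, FN, FP, TN), treating `positive` as the Class=Yes label."""
    tp = fn = fp = tn = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity(tp, fn):
    """Fraction of positive cases classified as positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negative cases classified as negative."""
    return tn / (tn + fp)


# Toy labels, not study data.
actual = ["pre", "pre", "pre", "normal", "normal"]
predicted = ["pre", "pre", "normal", "pre", "normal"]
tp, fn, fp, tn = confusion_counts(actual, predicted, positive="pre")
print(sensitivity(tp, fn), specificity(tn, fp), 1 - specificity(tn, fp))
```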

Page 75: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

RECEIVER OPERATING CHARACTERISTICS (ROC)

The ROC (receiver operating characteristic) curve shows the trade-off that one can achieve between:

  the false positive rate (1-Specificity, plotted on the X-axis), which needs to be minimized

  the detection rate (Sensitivity, plotted on the Y-axis), which needs to be maximized.
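One common way to trace such a curve is to sweep a decision threshold over a classifier's scores and record one (1-Specificity, Sensitivity) point per threshold. The sketch below assumes that each sample has a numeric score where higher means "more likely positive"; the function name `roc_points` and the toy data are hypothetical, and the slides do not say how their curves were generated.

```python
def roc_points(scores, labels):
    """Return ROC points as (false positive rate, sensitivity) pairs.

    scores: numeric scores, higher meaning "more likely positive".
    labels: True for positive cases, False for negative cases.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Toy scores, not study data.
for fpr, tpr in roc_points([0.9, 0.8, 0.4, 0.3], [True, False, True, False]):
    print(fpr, tpr)
```

A curve that rises quickly toward the top-left corner reaches high sensitivity at a low false positive rate.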

Page 76: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

ROC – Performance Characteristics

[Figure: ROC for Different Classification Methods. X-axis: 1-Specificity (0 to 1); Y-axis: Sensitivity (0 to 1). Labels shown: Voting; Entropy, Phase, Lmax.]

Page 77: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

ROC – Performance Characteristics

[Figure: ROC for Different Classification Methods. X-axis: 1-Specificity (0 to 1); Y-axis: Sensitivity (0 to 1). Labels shown: Voting and Average, each with Entropy, Phase, Lmax.]

Page 78: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

ROC – Performance Characteristics

[Figure: ROC for Different Classification Methods. X-axis: 1-Specificity (0 to 1); Y-axis: Sensitivity (0 to 1). Labels shown: Voting and Average, each with Entropy, Phase, Lmax; L+P+E with Average and Voting.]

Page 79: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

ROC – Performance Characteristics

[Figure: ROC for Different Classification Methods. X-axis: 1-Specificity (0 to 1); Y-axis: Sensitivity (0 to 1). Labels shown: Voting and Average, each with Entropy, Phase, Lmax; L+P+E with Average and Voting; L+P with Average and Voting.]

Sensitivity = 95.7%
Specificity = 75.4%

Page 80: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Results

Page 81: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Any More Sophisticated Method?

Page 82: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Support Vector Machines: 2-Class Linearly Separable Case

Page 83: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Mathematical Modeling

Page 84: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Leave-one-out Cross Validation

Cross-validation can be seen as a way of applying partial information about the applicability of alternative classification strategies.

K-fold cross validation:
  Divide all the data into k subsets of equal size.
  Train a classifier using k-1 groups of training data.
  Test the classifier on the omitted subset.
  Iterate k times.

Leave-one-out cross validation is the special case in which k equals the number of samples.
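The k-fold steps can be sketched as below. This is an illustrative outline under stated assumptions: folds are contiguous and unshuffled, fold sizes may differ by one when n is not divisible by k, and the one-dimensional nearest-neighbor classifier (`train_1nn`, `predict_1nn`) is a hypothetical stand-in for whichever classifier is being evaluated.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= n")
    start = 0
    for fold in range(k):
        # Spread the remainder so fold sizes differ by at most one.
        size = n // k + (1 if fold < n % k else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n))
        yield train, test
        start += size


def cross_validate(samples, labels, k, train_fn, predict_fn):
    """Return the fraction of samples classified correctly over k folds."""
    correct = 0
    for train, test in k_fold_indices(len(samples), k):
        model = train_fn([samples[i] for i in train], [labels[i] for i in train])
        correct += sum(predict_fn(model, samples[i]) == labels[i] for i in test)
    return correct / len(samples)


def train_1nn(samples, labels):
    """A 1-nearest-neighbor 'model' is just the stored training pairs."""
    return list(zip(samples, labels))


def predict_1nn(model, x):
    """Return the label of the stored sample closest to x."""
    return min(model, key=lambda pair: abs(pair[0] - x))[1]


# Toy one-dimensional features, not study data.
samples = [0.0, 0.2, 0.1, 5.0, 5.2, 5.1]
labels = ["normal"] * 3 + ["pre-seizure"] * 3
# Leave-one-out: k equals the number of samples.
print(cross_validate(samples, labels, len(samples), train_1nn, predict_1nn))
```

Because the folds are not shuffled, data sorted by class should be shuffled first when k is small; otherwise a fold may hold out an entire class.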

Page 85: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Classification Results

Page 86: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

QP for Clustering

Clustering Epileptic Brains

Page 87: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Hierarchical Clustering

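[Figure: dendrogram over points a, b, c, d, e. Leaves in order a, d, e, c, b; groups {a, d}, {b, c}, {b, c, e}, and {a, b, c, d, e}. Agglomerative clustering builds the tree from the leaves up; divisive clustering splits it from the root down.]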

Page 90: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Clustering via Concave Quadratic Programming (CCQP)

Formulate a clustering problem as a Quadratic Integer Program (QIP)

where
  A is an n×n T-index matrix of pairwise distances
  λ is a parameter adjusting the degree of similarity within a cluster
  x_i is a 0-1 decision variable indicating whether or not point i is selected (assigned) to be in the cluster
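The slide's formula is not reproduced in this transcript. Purely as an illustration, the sketch below assumes an objective of the form min x^T (A - λI) x over x in {0,1}^n, which is one form consistent with the symbol descriptions above; it should not be read as the author's exact model. The function names and the toy matrix are hypothetical, and brute-force enumeration is used only because the example is tiny.

```python
from itertools import product


def qip_objective(A, lam, x):
    """Assumed objective x^T (A - lam*I) x for a 0-1 vector x."""
    n = len(A)
    return sum(
        (A[i][j] - (lam if i == j else 0.0)) * x[i] * x[j]
        for i in range(n)
        for j in range(n)
    )


def best_cluster(A, lam):
    """Enumerate all 0-1 vectors and return (selected points, objective)."""
    best_x, best_val = None, float("inf")
    for x in product((0, 1), repeat=len(A)):
        val = qip_objective(A, lam, x)
        if val < best_val:
            best_x, best_val = x, val
    return [i for i, xi in enumerate(best_x) if xi], best_val


# Toy distance matrix: points 0-2 are mutually close, point 3 is far away.
A = [
    [0.0, 1.0, 1.0, 10.0],
    [1.0, 0.0, 1.0, 10.0],
    [1.0, 1.0, 0.0, 10.0],
    [10.0, 10.0, 10.0, 0.0],
]
print(best_cluster(A, lam=5.0))
```

Under this assumed objective, each selected point earns a reward of λ while each selected pair pays its distance twice, so a larger λ admits larger, looser clusters.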

Page 91: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Advantages

In some instances, when λ is large enough to make the quadratic function concave, the QIP can be converted to a continuous problem (minimizing a concave quadratic function over a sphere).

Page 92: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

CCQP Algorithm

Page 93: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Patient 1: Box Plot of Average Solution

Lmax

Page 94: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Patient 1: Box Plots of Average Solution

Lmax Phase

Page 95: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Patient 2: Box Plots of Average Solution

Lmax Phase

Page 96: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Kruskal-Wallis Test

…is a nonparametric version of the one-way ANOVA

…is an extension of the Wilcoxon rank sum test to more than two groups

…compares samples from two or more groups.

…compares the medians of the samples in X, and returns the p-value for the null hypothesis that all samples are drawn from the same population (or equivalently, from different populations with the same distribution).
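For readers who want to see what the test computes, here is a minimal pure-Python sketch of the Kruskal-Wallis H statistic with the standard tie correction. The function names are hypothetical, and the p-value helper covers only the three-group case, where the chi-square approximation with 2 degrees of freedom has the closed form exp(-H/2).

```python
import math


def kruskal_wallis_h(*groups):
    """Kruskal-Wallis H statistic (tie-corrected) for two or more groups."""
    values = sorted(v for g in groups for v in g)
    n = len(values)
    rank_of = {}   # value -> average rank among all pooled observations
    tie_term = 0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        rank_of[values[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_sum_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


def p_value_three_groups(h):
    """Chi-square approximation with 2 degrees of freedom (three groups)."""
    return math.exp(-h / 2)


# Toy samples, not study data.
h = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(h, p_value_three_groups(h))
```

The chi-square approximation is rough for samples this small; it is shown only to connect the statistic to the p-value mentioned above.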

Page 97: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Assumptions

The Kruskal-Wallis test makes the following assumptions about the data in X:
  All samples come from populations having the same continuous distribution, apart from possibly different locations due to group effects.
  All observations are mutually independent.

The classical one-way ANOVA test replaces the first assumption with the stronger assumption that the populations have normal distributions.

Page 98: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

T-test

Test the hypothesis of the difference in means of two samples.

Determine whether two samples, x and y, could have the same mean when the standard deviations are unknown but assumed equal.

Asymptotically, the T_xy index follows a t-distribution with n-1 degrees of freedom.

Page 99: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Results – Significance Level

Page 100: Optimization and Data Mining in Epilepsy Research W. Art Chaovalitwongse Assistant Professor Industrial and Systems Engineering Rutgers University

Concluding Remarks

Overview of Epilepsy Research
Applications of Data Mining and Optimization Techniques
Interplay between theory and application
Quadratic Programming for Feature Selection
Quadratic Programming for Clustering
Long-Term Monitoring Analysis