Machine Learning for Prediction of Anticonvulsive Drug Treatment Outcomes in Mecp2-Deficient Mice

by

Sinisa Colic

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

The Edward S. Rogers Sr. Department of

Electrical and Computer Engineering

University of Toronto

© Copyright by Sinisa Colic 2017

Machine Learning for Prediction of Anticonvulsive Drug Treatment Outcomes in Mecp2-Deficient Mice

Sinisa Colic

Doctor of Philosophy

The Edward S. Rogers Sr. Department of

Electrical and Computer Engineering

University of Toronto

2017

Abstract

Anticonvulsive drug (ACD) treatments produce inconsistent outcomes, often necessitating patients

to go through several drug trials before a successful treatment can be found. In this thesis we apply

a novel approach, using machine learning techniques to predict epilepsy treatment outcomes of

commonly used ACDs. Machine learning algorithms were trained and evaluated using features

obtained from intracranial electroencephalogram (iEEG) recordings of the epileptiform discharges

observed in the Mecp2-deficient mouse model of Rett Syndrome. Our work on Mecp2-deficient

mice focuses on low frequency oscillations (LFO), high frequency oscillations (HFO) and their

interactions through cross-frequency coupling (CFC) to reveal iEEG based biomarkers that track

epileptic seizure pathology. Our findings revealed: variability across discharge events using iEEG

recordings, progression of longer duration discharges over five developmental time points, and the

increased cross-frequency coupling index ICFC of the delta (2-5 Hz) rhythm with the fast ripple

(400-600 Hz) rhythm in discharge events. These results suggest a link between long duration

discharges with elevated ICFC and the epileptic seizure pathology. Using the ICFC to label post-treatment

outcomes, we trained Support Vector Machines (SVM) and Random Forest (RF) machine

learning classifiers on time-based, normalized power and CFC comodulogram features to predict

the efficacy of ACD treatments. The results indicate that the comodulogram

features yielded better predictions, which were further improved when combined with time-based

features. Hence, machine learning techniques were able to rank ACDs by estimating likelihood

scores for successful treatment outcome. Identifying the most appropriate ACD treatment a priori

would reduce the burdens of drug trials and provide patient specific treatment options that could

lead to substantial improvements in patient quality of life.

Acknowledgements

To my supervisor Dr. Berj Bardakjian for your constant encouragement, compromise and abundant

optimism that always manages to light the way through any obstacles.

To my supervisory committee, Dr. Quaid Morris, Dr. Kostas Plataniotis, and Dr. Liang Zhang, for

your invaluable comments, feedback and unique perspectives. To my research collaborators, Dr.

James Eubanks, Rob Wither and Min Lang for all your experimental data contributions,

knowledge, and helpful suggestions.

To my fellow lab group members, past and present, for their helpful discussions, advice and all the

fun lab adventures over the years. Thank you: Mark Aquilino, Vanessa Breton, Dr. Marija Cotic,

Joshua Dian, Fihras Farah, Vasily Grigorovsky, Dr. Mirna Guirgis, Trevor Hilton, Daniel Jacobs,

Dr. Eunji Kang, Angela Lee, Dr. Ryan McGuinn, Dr. Demitre Serletis, Dr. David Stanely, Sam

Talasila, Uilki Tufa and Dr. Osbert Zalay. Special thanks to: Joshua Dian for setting up and

maintaining the Big Bang Theory computer cluster, Dr. Ryan McGuinn for providing the ICFC

surrogate analysis implementation, and Dr. Osbert Zalay for the multiple MATLAB toolboxes.

To my friends, for their encouragement, endless jokes and intermittent social distractions that have

provided ample breaks from research and opportunities to refresh.

To my family for their unconditional love and support. My parents for making tremendous

sacrifices without which I would not be here today. Most importantly to my beautiful wife for all

the times spent proofreading my writing and for patiently standing by my side on this roller coaster

ride called a PhD.

Table of Contents

List of Abbreviations

List of Figures

List of Tables

Chapter 1

Introduction

1.1 Epilepsy

1.1.1 Animal Models of Epilepsy

1.1.2 Mecp2-Deficient Mouse Model of Epilepsy

1.1.3 Mechanisms of Mecp2 Deficiency

1.2 Electrical Rhythms in the Brain

1.2.1 Low Frequency Oscillations

1.2.2 High Frequency Oscillations

1.2.3 LFO-HFO Cross-Frequency Coupling

1.3 Prediction of Treatment Outcome

1.4 Objectives and Hypothesis

1.4.1 Objectives

1.4.2 Hypothesis

1.5 Outline of Chapters

Chapter 2

Methodology

2.1 Experimental Setup

2.1.1 Animal Subjects

2.1.2 Implantation Surgery

2.1.3 iEEG Recording and Analysis

2.1.4 Established and Experimental Anticonvulsive Drug Treatments

2.2 Signal Processing

2.2.1 Automated Discharge Detection

2.2.2 Delta Power

2.2.3 Time-Frequency Analysis

2.2.4 Comodulogram

2.2.5 Empirical Mode Decomposition

2.3 Machine Learning Algorithms

2.3.1 Support Vector Machines

2.3.2 Random Forest

2.3.3 Feature Extraction

2.3.4 Feature Selection and Reduction

2.3.5 Evaluation

2.4 Statistical Analyses

2.4.1 Gamma Fit

2.4.2 Surrogates

2.4.3 Standard error and t-tests

Chapter 3

Long Duration Discharges as Biomarkers of Epileptiform Pathology

3.1 Analysis of Discharge Durations

3.2 Gamma Distribution

3.3 Clustering

3.4 Delta and Theta LFOs

3.5 Long Duration Discharges

3.6 Discussion

Chapter 4

LFO-HFO Cross-Frequency Couplings as Biomarkers of Epileptiform Pathology

4.1 HFO Activity

4.2 Cross-Frequency Coupling

4.3 Mecp2 Gene Reactivation

4.4 Treatment Outcomes of ACDs

4.5 Discussion

Chapter 5

A Machine Learning Approach to Treatment Outcomes

5.1 Dataset

5.2 Feature Selection

5.3 Training

5.4 Treatment Prediction

5.5 Discussion

Chapter 6

Conclusions and Future Work

6.1 Significant Contributions

6.2 Future Directions

References

List of Abbreviations

ACD Anticonvulsive Drugs

CFC Cross-Frequency Coupling

CRG Cognitive Rhythm Generator

CWT Continuous Wavelet Transform

DBS Deep Brain Stimulation

EEG Electroencephalogram

EEMD Ensemble Empirical Mode Decomposition

EMD Empirical Mode Decomposition

FDA Food and Drug Administration

FIR Finite Impulse Response

FN False Negative

FP False Positive

FPR False Positive Rate

GAERS Genetic Absence Epilepsy Rat from Strasbourg

HFO High Frequency Oscillations

HMM Hidden Markov Model

ICFC Index of Cross-Frequency Coupling

iEEG Intracranial Electroencephalogram

IMF Intrinsic Mode Function

LFO Low Frequency Oscillations

LFP Local Field Potential

MECP2 Methyl-CpG-Binding Protein 2

ML Machine Learning

MRI Magnetic Resonance Imaging

mRMR Minimum Redundancy Maximum Relevance

PCA Principal Component Analysis

QP Quadratic Programming

RBF Radial Basis Function

REM Rapid Eye Movement

RF Random Forest

ROC Receiver Operating Characteristic

ROI Region of Interest

SLE Seizure-Like Events

SVM Support Vector Machines

TN True Negative

TP True Positive

TPR True Positive Rate

t-SNE t-Distributed Stochastic Neighbor Embedding

List of Figures

Figure 1. Overview of experimental design, signal processing and machine learning methods.

Figure 2. Mecp2-deficient mouse model of Rett Syndrome.

Figure 3. Automated epileptiform discharge detection applied on a 10 second iEEG segment.

Figure 4. Normalized time-frequency analysis of discharges observed during epileptiform events.

Figure 5. Step-by-step computation of the ICFC measure for a particular HFO-LFO frequency pair.

Figure 6. Decomposition of time-series recordings using EEMD.

Figure 7. Description of SVM algorithm.

Figure 8. Summarization of the Random Forest algorithm.

Figure 9. Features used for training the treatment prediction algorithms.

Figure 10. ROCs used for evaluation of treatment outcome predictions for different machine learning algorithms.

Figure 11. Gamma distribution comparison for different alpha (α) values.

Figure 12. Time-series iEEG recordings obtained from Mecp2-deficient mouse model showing changes of discharge occurrence and duration with age.

Figure 13. Average discharge occurrence, duration and frequency across five developmental points in time.

Figure 14. Histograms of discharge durations and inter-discharge durations with the corresponding gamma fit overlaid in red after 14-18 months of development.

Figure 15. Comparison of alpha values obtained from ictal and interictal events from Mecp2-deficient mouse model against values obtained from patients with absence epilepsy and other animal models.

Figure 16. Distributions of discharge occurrences over 24 hour periods obtained from mice 14-18 months of age.

Figure 17. Analysis of the presence of clustering using Ripley’s K statistic.

Figure 18. Comparison of discharge counts per hour for the day/night cycles and high/low delta frequency power.

Figure 19. Comparison of discharge and inter-discharge durations for high and low delta frequency power regions.

Figure 20. Tracking of long and short duration discharge counts across five developmental time points.

Figure 21. Z-score normalized time-frequency distribution of a discharge.

Figure 22. Progression of time-frequency distribution across five developmental time points.

Figure 23. Quantization of fast ripple HFO presence associated with discharges.

Figure 24. Comodulogram of LFO-HFO cross-frequency coupling in a Mecp2-deficient mouse.

Figure 25. EEMD extraction of delta, theta and fast ripple rhythms for the purposes of computing cross-frequency coupling index.

Figure 26. Tracking of phase-amplitude, cross-frequency coupling of delta – HFO and theta – HFO in long and short duration discharges over five developmental time points.

Figure 27. Characterization of discharges in Mecp2-deficient mice pre- and post-Mecp2 gene reactivation.

Figure 28. Phase-amplitude cross-frequency coupling of delta – HFO and theta – HFO in long and short duration discharges compared pre- and post-Mecp2 gene reactivation.

Figure 29. Delta-HFO CFC in long duration discharges used as a biomarker to evaluate treatment outcome.

Figure 30. Low dimensional feature projections using t-SNE. Each subject was identified by a unique colour.

Figure 31. Low dimensional feature projections for the anticonvulsive drug THIP.

Figure 32. ROC evaluation of feature sets and machine learning methodologies.

Figure 33. Predicted likelihood of favorable treatment outcome across four commonly used ACDs.

Figure 34. Combined LFO and HFO features identify channels in the ROI.

Figure 35. Schematic of responsive neuromodulation protocol, combining the CRG model of Seizure-Like Events with RBF, periodic, random-repetitive and random modulation modalities.

List of Tables

Table 1. Examination of delta-HFO phase-amplitude coupling pre and post anticonvulsive drug administration.

Table 2. Summary of feature reduction applied for THIP drug treatment prediction.

Chapter 1

Introduction

1.1 Epilepsy

Epilepsy is characterized by disruption in the operation of normal brain activity, often described

as electrical storms in the brain. The word “epilepsy” is derived from Latin and Greek words for

“seizure” or “to seize upon” [1]. As a neurological disease, epilepsy has no known point of origin and can be traced

as far back in time as medical records exist. One of the earliest records of the disease can be

found in a Babylonian medical text from 1100 BC, which details the treatment and likely

outcomes and characterizes many features of the different seizure types [2]. Due to their limited

biomedical understanding, the Babylonians attributed the seizures to possession by evil spirits and

called for treatment through spiritual means. This sort of spiritual thinking continued throughout

Greek and Roman times and well into the modern age. Treatments were often spiritual in

nature: in the Dark Ages, epilepsy was associated with demons, and it was believed that drilling a

hole into the skull would expel the demon and cure the patient. These notions of evil spirits and

demons were commonplace well into the 17th century.

The landscape started to change in the mid-1800s when anticonvulsive drugs (ACDs) were

discovered [1]. As our knowledge of the brain improved and pharmacological treatments became

available, scientific reason prevailed. The associations with evil spirits slowly disappeared and were

replaced by scientifically founded postulations.

The scientific successes of the 20th century revealed epilepsy to be a dynamic neurological disease

[3], associated with hyperexcitability, which disrupts normal brain activity.

These disruptions lead to abnormal neuronal activity, known as seizures, which can

be localized or generalized, with widely varying symptoms and manifestations. The underlying

causes have been organized into three subgroups: genetic, structural-metabolic and unknown

etiology [4].

Epilepsy afflicts approximately 1.8% of the population [5]. The vast majority of patients can be

treated with ACDs; however, variability in epilepsy etiology translates to variability in treatment

outcome. Thus, not all patients respond favorably to anticonvulsive drugs. Approximately 20-40%

of patients develop what is known as intractable epilepsy, where medication does not lead to

seizure freedom [6, 7]. For these patients, the only treatment options are surgery or

neuromodulation.

The variability in ACD treatment outcome can best be seen in the treatment of epileptic seizures

in Rett Syndrome. Rett Syndrome is a neurodevelopmental disorder and one of the

most common causes of mental retardation in females. It is an X-linked disorder that primarily

affects girls at a rate of approximately 1:10,000 live female births. Rett Syndrome is caused by

mutations in the gene encoding Methyl-CpG-Binding Protein 2 (MECP2) [8]. Unlike most forms

of epilepsy, the epilepsy resulting from Rett Syndrome typically does not respond favorably to

common ACDs and can require an exhaustive search to find an effective treatment option [9, 10].

Our brief glimpse into the history of epilepsy reveals that when little was known about the disorder,

the prevailing belief was that it was caused by demons and evil spirits. As new information

became available and our scientific understanding grew, we learned that there is a physical basis

for the disorder and possible treatments. Further understanding of epileptogenesis will only lead

to improved treatment options and better treatment selection.

1.1.1 Animal Models of Epilepsy

Investigation of epileptogenesis and evaluation of novel treatments are mainly achieved through

the study of animal models of epilepsy. A large variety of animal models have been created, including

pharmacological (e.g. pilocarpine, kainate), electrical (e.g. kindling), genetic (e.g. knock-out

mice), and injurious (e.g. trauma, hypoxia, stroke) models, each designed to mimic the equally

numerous types and causes of epilepsy in humans [11]. These models of epilepsy can be

further categorized as in vitro or in vivo, each with its own advantages and disadvantages.

The in vitro models, including brain slices, cell cultures and molecular assays, provide a reduced

biological system and are ideally suited for obtaining precise cellular recordings and

pharmacological responses of the cell. The low magnesium hippocampal slice model is well

established in the literature owing to its reliable effects in vitro and its ability to generate spontaneous

and recurrent seizure-like events (SLEs) that resemble SLEs observed in vivo [12]; however, like

all slice models, it is limited to measuring local network interactions, which do not fully

represent the intact brain. Other models have been developed to get around the limitation of local

network interactions; for example, the whole-hippocampal slice model records directly from the

intact hippocampus and preserves more of the network [13, 14]. In addition to being confined to

local networks, in vitro models are difficult to maintain for long periods of time and therefore are

generally limited to short-term recordings.

Unlike in vitro models, in vivo models are not limited to local network connections, can be

maintained for long-term studies, and exhibit behavioural and electroencephalographic seizure-like

events (discharges) that more closely mimic the clinical features of human epilepsy [11]. The

kindling and kainic acid models are two commonly used in vivo models of epilepsy, which work

by applying electrical and pharmacological stimulation, respectively, to induce SLEs [15, 16].

The kindling technique may be employed on a wide

variety of species, such as dogs, rabbits, cats and monkeys [16-19]. Although both models are effective at

generating SLEs, they result in significant, irreversible brain damage [15, 16]. The Mecp2-deficient mouse

model is an in vivo model of epilepsy that avoids the brain damage resulting from the kainic acid

and kindling techniques. Furthermore, the Mecp2-deficient mouse model is obtained using genetic

knock-out mice that can be treated, or rescued, using gene reactivation techniques, thus making it

ideally suited for studying anticonvulsive drug (ACD) treatment outcomes.

1.1.2 Mecp2-Deficient Mouse Model of Epilepsy

The Mecp2-deficient mouse model is an in vivo model of Rett Syndrome. It recapitulates many of

the behavioural and neurological deficits observed clinically in Rett Syndrome and shows

spontaneous and recurrent discharges similar to those observed in other epilepsy models [20].

These discharges exhibit absence-like characteristics and are severely attenuated with the

administration of ethosuximide, an anti-absence drug [21]. It has been suggested that these

rhythmic spike and wave discharges result from thalamo-cortical network dysfunction, as has been

reported in other rodent models with absence-like discharges [22, 23]. Mecp2-deficient mouse

models result from either a lack of Mecp2 or the expression of a relevant mutant form of Mecp2

[24]. One of the earliest Mecp2-deficient mouse models is the Mecp2tm1.1Bird model, which results

from the excision of exons 3 and 4 of the Mecp2 gene, resulting in deletion of the amino-terminal

eight amino acids of the Mecp2 protein [25]. Since the earliest models, there have been several

improvements to allow for selective activation of the Mecp2 protein at specific time points in the

development of the mouse using Tamoxifen, which is an estrogen receptor antagonist [20]. By

selectively reactivating Mecp2 function, it is possible to assess whether Mecp2 deficiency

during developmental stages has irreversible detrimental effects [24, 26]. Furthermore,

this opens the possibility of studying pre- and post-gene-reactivation effects to determine

biomarkers of epileptiform activity.

1.1.3 Mechanisms of Mecp2 Deficiency

A common phenotype in mouse models lacking Mecp2 is cortical network hyperexcitability [27,

28]. Network hyperexcitability has been associated with cortical spike and wave discharges that

have been identified in EEG recordings [21, 26, 29]. Hyperexcitability resulting from Mecp2

deficiency appears to be specific to brain regions. Cortical network hyperexcitability has been

observed in mice globally lacking Mecp2, which was not seen in mice selectively lacking Mecp2

from HOX-B1 hindbrain neurons [30]. Hyperexcitability has also been observed in mice lacking

Mecp2 in the forebrain inhibitory neurons, which was not seen in mice selectively lacking Mecp2

in extracortical forebrain neurons [31]. Together these results suggest that specific GABAergic

circuits may be responsible for network hyperexcitability in Mecp2-deficient mice [32].

The network hyperexcitability of Mecp2-deficient mice is thought to result from

enhanced GABAB receptor activity due to diminished cortical expression of the GABA transporter

GAT-1 [32]. By enhancing GABAB receptor activity using GS-39783 (2.5 mg/kg), a

positive allosteric modulator of the GABAB receptor, Zhang et al. showed significantly elevated

discharge incidence rates for Mecp2-deficient mice. At higher dosages, GS-39783 (5.0 mg/kg) was

even able to induce discharge events in female wild-type mice. Administration of NO-711 (1.0

mg/kg), a GAT-1 protein inhibitor, in female Mecp2-deficient mice, also resulted in a significant

increase in discharge incidence rate. Furthermore, administering an increased dosage of NO-711

(2.5 mg/kg) in female wild-type mice was also shown to induce discharge events [32]. Previous

studies have shown that impairment of GAT-1 enzymatic activity is linked to the genesis of

discharge events in cortical circuits [33]. Selective stimulation of extra-synaptic GABAA receptor

activity using THIP, a selective agonist for GABAA receptor delta-subunit, resulted in a decrease

of excitatory discharge patterns. Together these findings support the link between the reduction in

GAT-1 and the network hyperexcitability observed in Mecp2-deficient mice.

1.2 Electrical Rhythms in the Brain

Characterization of epilepsy is a vital step in determining an appropriate treatment. The best way

to achieve this characterization is to measure the electrical activity in an epileptic brain and

compare it with the normal, or expected activity in a healthy brain. Through this comparison it is

possible to obtain biomarkers of the pathology.


The electrical signals measured in the brain, referred to as electroencephalograms (EEGs), provide

a measure of the mean electrical activity between two points in the brain, one point being the point

of interest and the other being a reference.

EEG recordings can be obtained from outside the head using scalp EEG, or invasively from inside

the head using implantable electrodes, referred to as intracranial EEG (iEEG). For non-invasive

scalp EEG, electrodes are distributed evenly over the head according to the 10-20 system. Due to

the skull, hair and distance from the brain, the signals obtained this way are prone to artifacts.

Intracranial EEG overcomes these limitations, though at the price of surgical implantation and the

risks associated with it. Intracranial recordings can be achieved using a flat strip of electrodes

placed on the surface of the cortex, or using electrodes inserted deeper inside the brain. While

iEEG is limited to a smaller spatial region, these recordings provide a more detailed representation

of the underlying electrical activity, and allow access to deeper brain structures whose activity may

not be detected by scalp EEG [34].

Since the first recordings conducted by Berger in 1924, there have been numerous discoveries

facilitated by EEGs. Among the most notable discoveries are the brain rhythms: electrical

oscillations found in different frequency bands. These rhythms can be broadly categorized as low

frequency oscillations (LFOs) and high frequency oscillations (HFOs). Their presence and

interactions have been shown to play vital roles in normally functioning brains and to provide insights

into epilepsy and other neurological pathologies [35].

1.2.1 Low Frequency Oscillations

EEG recordings from the brain are in the microvolt range, which leads to numerous technological

challenges. The primary challenge is the elimination of artifacts. To avoid artifacts, the initial

acquisitions were limited to LFOs which are defined to be less than 30 Hz [36]. Over the years

LFOs have been clinically categorized into four frequency ranges [35]:

0.5-4 Hz – Delta band


4-8 Hz – Theta band

8-12 Hz – Alpha band

12-30 Hz – Beta band

Normal theta band activity is further classified into two types: hippocampal theta rhythms and

cortical theta rhythms. Hippocampal theta rhythms originate in the hippocampus. Due to the

strong link between the hippocampus and motor behavior, hippocampal theta rhythms are thought

to be linked to sensorimotor processing and mechanisms of learning and memory [37]. Cortical

theta rhythms are frequently observed in young children and can also show up during meditation,

drowsiness and some of the early stages of sleep [38]. Theta rhythms can also show up in cases of

pathology. For example, in epilepsy the theta rhythm is the predominant rhythm found during

epileptic seizures [35]. This abnormal theta activity observed during epileptic seizures is usually

characterized by large amplitude spikes.

Normal delta rhythms have been found to identify deep stages of non-REM sleep, particularly stages 3 and

4 [38]. Delta is related to physical awareness and consciousness, and can be induced through deep

meditation. Studies have found that cognitive tasks may elicit delta network activity. Delta rhythms

have also been associated with motivation, and have been found to increase in power during

periods of attention or pain.

During wakefulness, delta activity is often overpowered by activity in higher frequency bands, but

becomes dominant when these higher frequencies are pathological in nature. Sleep studies have

revealed that epileptiform activity is influenced by the presence of the delta band, with some

studies showing that the rate of occurrence and characteristics of seizures change during the

presence of high delta rhythms [39]. In the absence of spiking activity, the presence of delta slow

waves was more prevalent with uncontrolled seizures [40]. Slowing of interictal delta activity has

been observed in patients with temporal lobe epilepsy [41] and has been linked to positive surgical

outcomes suggesting that delta could play a role as a biomarker for surgical treatment [42].


Theta and delta LFO frequency bands have been extensively studied in relation to epilepsy. LFOs

by their nature can traverse further spatially and have a greater effect on the communication

between various regions of the brain [43], potentially maintaining the seizure episode once it

begins.

1.2.2 High Frequency Oscillations

With the arrival of modern digital recording in the 1980s, it became possible to record frequencies

greater than 80 Hz, which are referred to as HFOs. HFOs have been found in both animal [44, 45]

and human studies [45-50] to be present during normal and pathological activities in the brain.

Under normal brain function, HFOs have been found in various brain structures, though

particularly in the motor cortex during finger movements [51], and in the temporal lobe during

working memory tasks [52]. HFOs have also been linked to numerous tasks including language,

attention, visual and auditory tasks. It has been hypothesized that HFOs play a role in cortical

processing, rather than acting as a marker for any given brain region [53]. HFOs are typically

classified into two frequency bands: ripples (80 – 250 Hz) and fast ripples (> 250 Hz) [54].

The most notable feature separating the ripples and fast ripples is their spatial origin. It has been

shown that ripples generated from hippocampus or parahippocampal structures are often

associated with normal physiological activity, while those associated with the dentate gyrus are

often considered to be pathological in nature. Buzsaki et al. provide evidence that ripples play a

role in memory consolidation. In rodent experiments where ripples were selectively suppressed

after learning using timed electric stimulation, large impairments in daily performance were

observed, comparable to those caused by surgically damaging the hippocampus [55]. It has been

postulated that ripples constitute a replay of important events, so that they can be encoded in a

more permanent manner [56]. Furthermore, Grenier et al. found ripples to have a preferential

presence during slow-wave sleep, during which time the brain is disconnected from the outside

world, and their link to strong neuronal activity suggests that ripples may be involved in plasticity


processes [57]. It has been suggested that ripples result from inhibitory field potentials which may

be involved in strong coherence of long-range neuronal activity [44, 58]. Studies on epilepsy have

found that ripples accompanied by continuous/semi-continuous background EEG activity are more

prevalent in the hippocampus and occipital lobe, but show no correlation to the seizure onset or

lesion sites [59], suggesting physiological rather than pathological neuronal activity. Fast ripples

have been recorded in both rodent and human somatosensory cortex and are believed to reflect the

coordinated discharge of neurons involved with the processing of incoming sensory information

[60-62]. In general, fast ripples are believed to be pathological in nature and have been suggested

to be the result of the strong coherence of abnormally bursting neurons [63]. Some have speculated

that fast ripples are harmonics of the ripples [64]; however, the difference in the neuronal activity

involved and the different spatial origins suggest otherwise [65].

In recent years, HFOs, and in particular fast ripples, have gained prominence as biomarkers well

suited for the identification of seizure pathology [49]. HFOs have been recorded as preceding

seizure onset in epileptic patients [66]. Additionally, HFOs have been shown to provide diagnostic

value. In particular, fast and abnormal HFOs in the 150-500 Hz frequency range have been

recorded during interictal periods from the hippocampus and entorhinal cortex of patients with

mesial temporal lobe epilepsy [62, 63]. HFOs, unlike the delta and theta LFOs, have been shown

to be spatially confined to narrow regions and permit reliable localization of seizure onset zone

(SOZ) [67, 68] by measuring increases in power [69, 70]. Finding the SOZ may allow for focused

intervention, whether through resection surgery or some form of electrical neuromodulation [71-

73].

The clinical implications of finding the seizure focus have made HFOs a prominent topic of

research over the past decade. Recent findings have shown that resection of tissue associated with

HFOs is linked to positive surgical outcomes in both adults [74] and children [69, 75].

Retrospective correlations of resection of regions with HFO activity with post-surgical outcomes

have been performed in a number of studies [74-76]. Independent studies performed by Jacobs et

al. and Wu et al. found that patients with a positive post-surgical outcome had a larger portion

of HFO-associated tissue resected compared to patients with a poor post-surgical outcome. Fast

ripple oscillations were detected retrospectively in 80% of patients and in cases where complete


resection of tissue containing fast ripple activity was achieved, the resection was shown to correlate

with seizure freedom [75].

Furthermore, many studies have tried to use HFO threshold rates as a way to distinguish

epileptogenic regions, but this can be problematic, since the thresholds can be affected by sleep

[62, 77] or medication [78], and often appear to depend on both the brain region and the

patient. In addition to surgical correlations, HFOs have also been linked to therapeutic

interventions using ACDs. After medication is withdrawn prior to assessment of surgical

candidacy, increases in the number of HFOs have been observed [78], suggesting that iEEG-based

HFO biomarkers can track changes due to ACDs. Differentiating physiological HFOs from

pathological HFOs is not straightforward, as they share similar and overlapping frequencies [65].

It is not completely clear which characteristics of the HFO activity are clinically relevant; hence,

additional analysis techniques, such as the interactions of LFOs and HFOs, are warranted.

1.2.3 LFO-HFO Cross-Frequency Coupling

Multitudes of oscillations spanning LFO and HFO frequency ranges have been identified

throughout the many regions of the brain. LFOs and HFOs, both on their own and through their

interaction, have been shown to characterize brain state activity. This interaction of rhythms,

referred to as cross-frequency coupling (CFC), is best understood in studies on the hippocampus

where neural coding in the form of nested oscillations has been observed to facilitate short-term

memory storage [79]. Coupled oscillations are ultimately suggested to be a timing mechanism for

the serial processing of short-term memories. Lisman et al. [80] argue that theta and gamma

activity are part of a common functional system and they form a theta-gamma code that allows the

hippocampus to process and recall long-term memories. It has been suggested that each memory

is stored in a different 40 Hz sub-cycle of a particular LFO; that is, the amplitude of a higher

frequency signal tends to show a preference for a particular phase of the lower frequency signal

[81-84]. Studies on rodents show that CFC coincides with locations in a maze, with different cells

firing as a rodent traverses a maze, hence the term “place cells”. The systematic progression of the


phase as the spatial location changes is referred to as phase precession. These studies on place

cells suggest phase-amplitude cross-frequency interactions are important for neural coding [80].

The ability of neural rhythms to carry out physiological functions, as seen with the theta-gamma

neural code [85, 86], also suggests that disruptions in the neural codes can lead to pathology.

Disruptions in neural code are best seen in studies on epilepsy, which show that interactions of

rhythms or neuronal code can be correlated with surgical outcome [87]. Melanie et al. showed

that surgical resection of regions showing fast ripples and ripples coexisting with flat background

EEG activity led to seizure freedom, whereas resection of areas generating ripples with a

continuously oscillating background EEG pattern did not result in favourable surgical outcomes

[87]. Phase-amplitude CFC of ripples with delta rhythms has been documented during ictal

states [88]. Guirgis et al. highlight that HFO presence is mainly linked to pathology when other

rhythms are present, specifically the delta (< 4 Hz) rhythm. Their findings reveal that delta-HFO

coupling emerges at seizure onset and termination, implying that the delta-HFO coupling may be

involved in transition mechanisms leading to epileptogenesis [89]. Taken together, these disruptions in

cross-frequency coupling can provide insight into pathology and can be used as biomarkers to

better characterize and treat neurological disorders.

The greatest obstacles preventing more widespread clinical use of HFO-based diagnosis

techniques are the difficulties associated with recording HFOs [90] and the sheer scale of the

recordings. Intracranial recordings are invasive and are only used as a last resort when all other

forms of intervention fail. Further, from the Nyquist criterion, HFOs require sampling in the

kilohertz range. Due to their spatial focus, recording HFOs then requires many spatially distributed

electrodes which produce massive high dimensional data sets which are difficult to process, calling

for the use of automated techniques of signal processing and machine learning.
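
The sampling and data-volume argument above can be made concrete with a short calculation. The sketch below is illustrative only: the 600 Hz upper edge comes from the fast ripple band (400-600 Hz) studied in this thesis, while the channel count, sample width and recording length are hypothetical values chosen for the example.

```python
def nyquist_rate_hz(f_max_hz):
    """Minimum sampling rate needed to represent content up to f_max_hz."""
    return 2.0 * f_max_hz


def recording_size_bytes(fs_hz, n_channels, duration_s, bytes_per_sample=2):
    """Uncompressed size of a multichannel recording."""
    return int(fs_hz * n_channels * duration_s * bytes_per_sample)


fast_ripple_upper_hz = 600.0  # upper edge of the 400-600 Hz fast ripple band
fs_min = nyquist_rate_hz(fast_ripple_upper_hz)

# Hypothetical setup: 64 electrodes, 16-bit samples, one hour sampled at 4 kHz.
size = recording_size_bytes(fs_hz=4000, n_channels=64, duration_s=3600)
print(f"minimum sampling rate: {fs_min:.0f} Hz")
print(f"one hour of data: {size / 1e9:.2f} GB")
```

Even this modest hypothetical configuration produces close to two gigabytes per hour, which illustrates why manual review does not scale.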

1.3 Prediction of Treatment Outcome

Univariate analysis of electroencephalogram (EEG) activity has been used to differentiate between

healthy individuals and patients suffering from a wide range of neurological disorders. These

findings are usually significant at a group level and have had limited clinical translation. This

should be of no surprise as the human brain is a highly complex system, with no two brains wired

exactly the same. The goal of early diagnosis, planning of treatment and tracking of disease

progression is not easily achieved.

Conventional research is typically applied at a group level focusing on commonalities across

groups of people and often ignoring differences as outliers. The group approach often ignores the

nonlinearity of the individual response. In contrast, doctors working with patients have to make

clinical decisions about individuals by taking into consideration medical background, blood work,

and other available data in order to make an objective inference at the level of an individual rather

than the group.

There has been a growing interest in the use of analytical methods to allow for inference at the

individual level. One way of achieving it is through the use of supervised machine learning

predictive models such as support vector machines (SVMs). The goal of supervised learning is to

develop a discriminative function that can make classifications or predictions from a set of

features. Advantages of this method are: (1) it allows for multivariate characterization at the level

of the individual, potentially leading to results with higher clinical translation; (2) supervised

machine learning techniques can pick up on subtle, otherwise undetectable nonlinear differences

in the brain.

The three main clinical focus areas employing inference include diagnosing, predicting disease

onset and, more recently, predicting treatment outcome. There have been many studies on

diagnosing disorders [91, 92] and predicting disease onset in advance [93], and only a few studies

have focused on predicting treatment response in a machine learning context [94, 95]. Of the

studies performed on predicting treatment outcomes, the vast majority focused on individuals

suffering from major depression, with some work done on schizophrenia [94, 96]. Applying

machine learning techniques in the study of major depression is clinically impactful, as individuals

respond differently to treatments. For example, in major depression a third of the patients show no

improvements to antidepressant treatment [97].

Studies using structural magnetic resonance imaging (MRI) data in conjunction with SVMs to

predict treatment outcome in major depression were able to link treatment outcome to grey matter

presence with an accuracy of 88.9% (88.9% specificity and 88.9% sensitivity) [95]. In a follow-up

study with a larger sample size, white matter and grey matter features correctly predicted

treatment outcome 12 weeks in advance with accuracies of 69.57%

and 65.22% respectively [98]. A similar study was applied on chronic schizophrenia, this time

using pretreatment EEG data to predict the response to clozapine [94]. In that study they were able

to achieve an accuracy of 85% on 23 subjects, providing support for predicting treatment outcome

from brain activity. In a study on schizophrenia, it was shown that scalp EEG features examining

combinations of power, and coherence measures are effective at predicting favorable treatment

response to antidepressant drug therapy [96].

The topic of treatment outcome prediction is of great importance for epilepsy applications, as

individuals respond differently to treatments. For example, roughly 30-40% of patients do not

respond to anticonvulsive drugs, and of those that do respond, not all respond in the same way.

Finding the most effective treatment can be an ordeal for patients, often resulting in unnecessary

complications. Anticonvulsive drugs can make the seizures worse and more frequent, and are

associated with numerous side-effects that can affect cognition and patients’ abilities to perform

[99-101]. Furthermore, patients can build up a tolerance to certain drugs over time [102], at which

point the trial-and-error search for a treatment resumes. It would go a long way towards improving

quality of life for patients if treatments could be tailored to them individually. While machine

learning techniques have been applied in epilepsy studies to classify seizure events [103], predict

the seizure onset [104], and most recently, identify the seizure onset zone [105], machine learning

methods that predict treatment outcome have not, to the best of our knowledge, been applied in

epilepsy research until now.

Unsuccessful drug trials and delayed treatments have a high impact on quality of life and are

expensive for both patients and the health care system. Determining a priori the most effective

treatment would help improve the lives of patients and reduce the financial burden associated with

anticonvulsive drug treatments. The first step for developing a successful treatment plan is

accurately diagnosing or characterizing the disorder. Our brains are not well suited for multivariate

correlations, but fortunately machine learning techniques have been designed specifically with this

purpose in mind.

1.4 Objectives and Hypothesis

1.4.1 Objectives

This thesis studies the use of iEEG-based features to track LFO-HFO interactions of cross-

frequency coupling (CFC) in Mecp2-deficient mice undergoing ACD therapy, to determine

treatment efficacy. Two commonly used machine learning techniques, Support Vector Machines

(SVMs) and Random Forests (RF), are trained and evaluated on time-based, normalized power

and cross-frequency coupling features to predict the likelihood of treatment outcome for

commonly used ACDs.

The main contributions of this thesis are as follows:

i. Detection and identification of epileptiform discharges in a genetic model of epilepsy using

intracranial EEG recordings. Automated detection of discharge events is obtained from

several hours of time-series recordings in the Mecp2-deficient mouse model of Rett

Syndrome.

ii. Characterization of discharge durations over developmental time points. Not all

discharges are the same. Analyses of discharge durations show that long duration

discharges are better suited for tracking pathology.

iii. Identification and tracking of epileptiform activity over developmental time points.

Accurately diagnosing or characterizing epileptiform discharges is vital to developing a

successful treatment plan. LFO and HFO features of cross-frequency coupling obtained from

iEEG are used to track biomarkers of epileptiform pathology over development.

iv. Examine heterogeneity in drug treatment outcomes. Variability of ACD treatment outcome

across subjects is observed, suggesting that a patient-specific approach is necessary for

properly selecting the most appropriate ACD treatment.

v. Evaluate machine learning techniques for ranking the efficacy of drug treatment options.

SVM and RF machine learning methodologies are used to estimate likelihood scores for

successful treatment outcome based on time, normalized power, and cross-frequency

coupling features. In particular, features of cross-frequency coupling yield the most

effective a priori identification of appropriate ACD treatment.

This work has been published [26, 29, 71, 105-111].

1.4.2 Hypothesis

The hypotheses of this thesis are as follows:

H1: iEEG-based measures of delta-HFO phase-amplitude CFC in long duration discharges

can differentiate between epileptiform and non-epileptiform discharges, and can provide

patient-specific biomarkers of epileptiform activity.

H2: A priori features of LFO-HFO CFC in conjunction with machine learning techniques

can accurately predict drug treatment outcome in a Mecp2-deficient mouse model of

Rett syndrome.

1.5 Outline of Chapters

The thesis is organized as follows:

Chapter 1 introduces the background and motivation for the study of electrical rhythms, and their

role in the diagnosis and prediction of epileptiform activity.

Chapter 2 provides a review of the methodology pertaining to the experimental protocols and

time-frequency analyses and machine learning techniques applied in this thesis.

Chapter 3 identifies and tracks the spatiotemporal progression of epileptiform activity in the low

frequency range.

Chapter 4 characterizes and tracks the progression and interaction of low and high frequency

range epileptiform discharges.

Chapter 5 presents a patient-specific protocol using machine learning techniques to predict

treatment outcome to various anticonvulsive drugs.

Chapter 6 discusses the significance and future directions of the results presented in this thesis.

Chapter 2

Methodology

The work presented here is subdivided into three sections: experimental setup, signal processing,

and machine learning, the last of which covers feature extraction, labeling, training and

evaluation. The recordings of iEEG were obtained from Mecp2+/- mice exhibiting spontaneous and

recurrent epileptiform discharges. These recordings came from pre- and post- mecp2 gene

reactivation, and pre- and post- drug treatments (Figure 1a).

Signal processing techniques of normalized time-frequency power, rhythm extraction, and

measures of cross-frequency coupling were applied to obtain features of pathology (Figure 1b).

Signal processing techniques consist of the continuous wavelet transform (CWT), ensemble

empirical mode decomposition (EEMD), and index of cross-frequency coupling (ICFC). In addition

to visualizing the time-frequency power distribution, CWT was also used to obtain the phase and

amplitude signals used to compute the comodulogram which provides a measure of cross-

frequency coupling. A focused measure of cross-frequency coupling was obtained by using

EEMD. EEMD extracted Low Frequency Oscillations (LFOs) and High Frequency Oscillations

(HFOs) avoiding the need to select specific frequency bands a priori. In addition to visualizing the

data and determining the treatment efficacy, signal processing techniques were also used to obtain

features for the machine learning prediction algorithm.
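
To illustrate what a phase-amplitude coupling measure quantifies, the sketch below computes an entropy-based modulation index from a low-frequency phase series and a high-frequency amplitude series. This is a common generic measure and is not necessarily identical to the ICFC used in this thesis. The phase and amplitude series are synthesized directly here, whereas in practice they would come from the CWT; the function name, the 18 phase bins and the 3 Hz test rhythm are assumptions made for the example.

```python
import numpy as np


def modulation_index(phase, amplitude, n_bins=18):
    """Entropy-based phase-amplitude coupling strength in [0, 1]."""
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    # Mean high-frequency amplitude within each low-frequency phase bin.
    sums = np.bincount(idx, weights=amplitude, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    mean_amp = sums / np.maximum(counts, 1)
    p = mean_amp / mean_amp.sum()
    nz = p > 0
    entropy = -np.sum(p[nz] * np.log(p[nz]))
    # Zero for a flat amplitude distribution, larger when amplitude prefers a phase.
    return (np.log(n_bins) - entropy) / np.log(n_bins)


fs = 4000.0
t = np.arange(0, 10, 1 / fs)
delta_phase = np.angle(np.exp(1j * 2 * np.pi * 3.0 * t))  # phase of a 3 Hz rhythm
coupled_amp = 1.0 + 0.8 * np.cos(delta_phase)  # amplitude locked to the delta phase
flat_amp = np.ones_like(t)                     # amplitude independent of phase

mi_coupled = modulation_index(delta_phase, coupled_amp)
mi_flat = modulation_index(delta_phase, flat_amp)
print(mi_coupled, mi_flat)
```

A comodulogram can be viewed as a grid of such values, one for each pair of phase frequency and amplitude frequency.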

Figure 1. Overview of experimental design, signal processing and machine learning methods.

(A) Experimental setup consisted of pre- and post- gene reactivation along with drug testing. (B) Signal processing

overview highlights the rhythm extraction using complex CWT and extraction of specific rhythms of interest using

EEMD for purposes of generating labels. (C) Machine learning overview highlighting the selection of machine

learning algorithms using t-SNE projections. Figure taken from [111].

Machine learning algorithms were evaluated on their ability to predict treatment outcome from the

iEEG-based features. Figure 1c provides an outline of the machine learning methodology used in

this study. Samples for training and evaluation consisted of epileptiform discharges obtained from

six mice pre- and post-treatment. The subjects were examined for discharges using predefined

selection criteria [26, 107]. Post-treatment recordings of long duration discharges were evaluated

for presence of significant delta – fast ripple cross-frequency coupling. A percentage score was

used to evaluate the efficacy of each ACD treatment. Treatment outcomes resulting in a low

percentage of delta-HFO cross-frequency coupling were labeled as positive responders, whereas

those with high percentages were labeled as negative responders or non-responders. Two

commonly used machine learning algorithms, SVMs and RFs, were trained on EEMD time-based,

normalized power and comodulogram features to predict the likelihood of successful treatment.

The accuracy of these predictions was compared using Receiver Operating Characteristic (ROC)

plots, and by evaluating the percentage of samples that predicted successful treatment outcomes.
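
The labeling and evaluation steps described above can be sketched as follows. This is a minimal illustration, not the exact procedure used in this work: the 50% cutoff, the drug names, the coupling flags and the classifier scores are all hypothetical, and the area under the ROC curve is computed by direct pairwise comparison rather than with a library routine.

```python
def coupling_percentage(significant_flags):
    """Percentage of long duration discharges with significant coupling."""
    return 100.0 * sum(significant_flags) / len(significant_flags)


def label_outcome(percentage, cutoff=50.0):
    """1 = positive responder (low coupling), 0 = non-responder (hypothetical cutoff)."""
    return 1 if percentage < cutoff else 0


def roc_auc(labels, scores):
    """Area under the ROC curve: chance that a positive outscores a negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical post-treatment results: True = significant delta-HFO coupling.
post_treatment = {
    "drug_a": [False, False, True, False],  # 25% of discharges coupled
    "drug_b": [True, True, True, False],    # 75% of discharges coupled
}
labels_by_drug = {
    drug: label_outcome(coupling_percentage(flags))
    for drug, flags in post_treatment.items()
}

# Hypothetical likelihood scores from a trained classifier for four samples.
auc = roc_auc(labels=[1, 1, 0, 0], scores=[0.9, 0.4, 0.6, 0.1])
print(labels_by_drug, auc)
```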

2.1 Experimental Setup

All animal experimental procedures were approved by local ethics committees in accordance with

guidelines of the Canadian Council on Animal Care.

2.1.1 Animal Subjects

Experimental genotypes were produced by crossing female Mecp2+/- mice (Mecp2tm1.1Bird,

Jackson Laboratory, Bar Harbor, ME, USA) with male wild-type mice as described previously [112,

113]. In total, n=6 female Mecp2+/- subjects were used. In the case of the mecp2 gene reactivated

model, the mice were generated by crossing female Mecp2+/- mice with male Rosa26-Esr/Cre

transgenic mice (Gt(ROSA)26Sortml(cre/ESR1)Tyj/J, Jackson Laboratory) [29, 113]. In total,

n=4 female mecp2 reactivated mice (rescue mice) were used. Mice were subjected to experimental


procedures only once they displayed clear Rett-like behavioural traits and were studied between 3

and 23 months of age. All subjects were maintained on a pure C57Bl/6J background and housed

in a vivarium that was maintained at 22-23 °C with a standard 12-hour light cycle commencing at

06:00.

2.1.2 Implantation Surgery

Animals were implanted with polyimide-insulated stainless steel electrodes (125

μm) following procedures described previously [21, 114]. Preconfigured microelectrodes were

implanted in the somatosensory cortex (Bregma, -0.8 mm; lateral, 1.8 mm; depth, 1.5 mm) with a

reference implanted superficially in the frontal brain area (Bregma, +2.8 mm; lateral, 1.8 mm; depth,

0.5 mm) (Figure 2b). Animals were allowed at least 7 days of post-surgical recovery time before

further experimentation.

2.1.3 iEEG Recording and Analysis

Recordings of iEEG were obtained from Mecp2+/- mice exhibiting spontaneous and recurrent

epileptiform discharges centered roughly in the 6-10 Hz frequency range as described previously

[21, 115]. These recordings came from pre- and post- mecp2 gene reactivation, and pre- and post-

established and experimental anticonvulsive drug (ACD) treatment regimens (Figure 2). iEEG

signals were amplified 1000x, bandpass-filtered (0.01 – 1000 Hz) and digitized (Digidata 1300,

Axon Instruments, Weatherford, TX, USA). Data were sampled at 60 kHz and stored using

Clampfit 10.2 software (Axon Instruments). Recording sessions lasted from 30 min to 1 hour to

observe all of the behavior states. Discharge events were identified in the recordings using an


automated detection technique (see 2.2.1) based on predefined selection criteria as described

previously [26, 107]. Data preprocessing consisted of downsampling to 4 kHz followed by

removal of 60 Hz line noise using a high-order FIR notch filter with ±0.5 Hz cutoffs. Segments

with large amplitude muscle artifacts were excluded from analysis.
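
The preprocessing chain above (downsampling from 60 kHz to 4 kHz, then a narrow FIR notch around 60 Hz) can be sketched with NumPy alone. This is an illustrative reconstruction under stated assumptions rather than the code used in this work: the filter lengths, the Hamming window, the anti-aliasing cutoff and all function names are hypothetical choices, and a dedicated signal processing library would normally be used instead.

```python
import numpy as np


def lowpass_taps(cutoff_hz, fs, numtaps):
    """Hamming-windowed sinc low-pass FIR with unit DC gain."""
    n = np.arange(numtaps) - (numtaps - 1) / 2
    h = np.sinc(2 * cutoff_hz / fs * n) * np.hamming(numtaps)
    return h / h.sum()


def fir_filter(x, h):
    """Delay-compensated FIR filtering via FFT convolution (odd-length h)."""
    n = len(x) + len(h) - 1
    y = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(h, n), n)
    delay = (len(h) - 1) // 2
    return y[delay:delay + len(x)]


def downsample(x, fs_in, fs_out, numtaps=301):
    """Anti-alias low-pass filter, then keep every q-th sample."""
    q = int(round(fs_in / fs_out))
    return fir_filter(x, lowpass_taps(0.45 * fs_out, fs_in, numtaps))[::q]


def notch_taps(fs, f0=60.0, half_width=0.5, numtaps=16001):
    """Band-stop FIR built as identity minus an (f0 +/- half_width) band-pass."""
    h = lowpass_taps(f0 - half_width, fs, numtaps) - lowpass_taps(f0 + half_width, fs, numtaps)
    h[(numtaps - 1) // 2] += 1.0
    return h


# Synthetic test trace: an 8 Hz rhythm contaminated with 60 Hz line noise.
fs_raw, fs = 60000, 4000
t = np.arange(0, 10, 1 / fs_raw)
raw = np.sin(2 * np.pi * 8 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)

x = downsample(raw, fs_raw, fs)
clean = fir_filter(x, notch_taps(fs))
```

A 1 Hz wide stop band at a 4 kHz sampling rate needs a very long filter, which is why the notch here uses several thousand taps; the first and last two seconds of the filtered trace are affected by edge effects and should be discarded.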

2.1.4 Established and Experimental Anticonvulsive Drug Treatments

Discharges observed under Mecp2 deficiency have been linked to cortical network

hyperexcitability induced by alterations in GABAergic neurons which in turn affects GABA-

related signaling [32]. Pharmacological interventions were drawn from Food and Drug

Administration (FDA) approved and experimental ACDs and were chosen primarily for their

effectiveness on GABAergic circuits. The drugs and dosages used in this study were: Midazolam

at 0.5 mg/kg, Ganaxolone at 5 mg/kg, 4,5,6,7-tetrahydroisoxazolo[5,4-c]pyridine-3-ol (THIP) at 5

mg/kg, and Phenytoin at 30 mg/kg. Phenytoin is an FDA-approved ACD used mainly for long-

term control of seizures. Phenytoin is believed to protect against seizures by causing a voltage-

dependent block of sodium channels, which prevents sustained high frequency firing of action

potentials [116]. Phenytoin has also been linked to synaptosomal transport of glutamate and

GABA, which may explain its effectiveness in Mecp2-deficient mice [117]. Midazolam is a fast-

acting benzodiazepine typically reserved for emergency control of acute seizures, including status

epilepticus [118]. Midazolam acts on GABAA receptors; however, it does not directly activate

them, but rather acts as an intermediary, like other benzodiazepines, to enhance the effect

of the GABA neurotransmitter [119]. Ganaxolone was selected for anticonvulsive effects arising

from its GABAA modulatory effects. Ganaxolone has been shown to bind to the GABAA receptor

to modulate and open chloride ion channels causing an inhibitory effect on neurotransmission and

reducing the chance of action potentials [120]. THIP, which acts as a selective agonist for the GABAA

receptor delta subunit, was selected [121] and has previously been shown to have anticonvulsive

properties [32]. All drugs were dissolved in double-distilled H2O and administered intraperitoneally

to the animals. All pharmacological treatments were applied for a period of one day followed by a

day of washout before administering any other pharmacological compound.


Figure 2. Mecp2-deficient mouse model of Rett Syndrome.

(A) Close-up of the implanted polyimide-insulated stainless steel electrodes highlighting the arrangement.

The reference electrode was implanted superficially in the frontal cortex. The analyses presented here focused on the

recordings from the somatosensory cortex. (B) Representative time-series recording of an epileptiform discharge

observed in the Mecp2-deficient mouse model.

2.2 Signal Processing

2.2.1 Automated Discharge Detection

Time-series iEEG traces were visually inspected by a condition-blinded investigator to confirm and

quantify the presence of discharge activity as previously described [26, 122, 123]. A discharge

event was defined as having a duration of at least 0.4 seconds, an amplitude of at least 1.5-fold

background, and a frequency in the 6-10 Hz theta range [26].

Using the established manual criteria [26, 122, 123], we developed an automated method for

detecting epileptiform discharge events and recording the time of occurrence and duration. The

method is similar to that outlined in [26, 124], the main exception being the additional amplitude

criterion for defining a discharge event. The first step of the automation is the application of a 6–

10 Hz FIR band pass filter to isolate the frequency band associated with the discharges. This

frequency band was chosen as it limited the effect of high delta power, while at the same time

capturing the increased theta power present during a discharge (Figure 3). The envelope of the

filtered signal was created by convolving a Gaussian kernel of 200-point aperture with the square

of the filtered data. The envelope peaks at the presence of strong 6–10 Hz power. Normal cortical

LFP signals within this frequency range rarely display high-amplitude rhythmic spiking (Figure

3a). As a result, the envelope peak reflects the presence of a discharge event (Figure 3b). We then

determined envelope thresholds specific to each subject with an emphasis on minimizing false

positives. Discharge durations were determined by finding the left and right inflection points of

detected events, which indicated the start and end points respectively of a discharge. The inflection

points were computed by convolving the envelope with the derivative of the Gaussian kernel

resulting in the rate of change plot shown in Figure 3c. Discharges shorter than 0.4 seconds were not

considered. The average areas under the curve 0.5 seconds to the left and right were compared to

the average area under the discharge. If the average area under the curve of the discharge was at

least twice that of the sides, then the discharge was accepted.
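The detection steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names, the filter length `numtaps`, the Gaussian width (one sixth of the aperture), the search window around each supra-threshold run, and the use of SciPy's `firwin` and `filtfilt` are all choices made for the example. The sampling rate and the subject-specific threshold are inputs.

```python
import numpy as np
from scipy.signal import firwin, filtfilt


def gaussian_kernel(n_points):
    """Unit-area Gaussian window; a width of n_points/6 is an assumed choice."""
    k = np.arange(n_points) - (n_points - 1) / 2.0
    g = np.exp(-0.5 * (k / (n_points / 6.0)) ** 2)
    return g / g.sum()


def detect_discharges(x, fs, threshold, min_dur=0.4, numtaps=501, aperture=200):
    """Return a list of (start_s, end_s) tuples for accepted discharge events."""
    # 6-10 Hz FIR band-pass, run forward and backward for zero phase shift.
    taps = firwin(numtaps, [6.0, 10.0], pass_zero=False, fs=fs)
    theta = filtfilt(taps, [1.0], x)
    # Envelope: squared band-passed signal smoothed with a Gaussian kernel.
    g = gaussian_kernel(aperture)
    env = np.convolve(theta ** 2, g, mode="same")
    # Rate of change: envelope convolved with the derivative of the Gaussian.
    rate = np.convolve(env, np.gradient(g), mode="same")

    # Contiguous runs where the envelope exceeds the threshold.
    above = np.r_[False, env > threshold, False]
    edges = np.diff(above.astype(int))
    starts, ends = np.flatnonzero(edges == 1), np.flatnonzero(edges == -1)
    flank = int(0.5 * fs)
    events = []
    for a, b in zip(starts, ends):
        peak = a + int(np.argmax(env[a:b]))
        lo, hi = max(0, a - aperture), min(len(x), b + aperture)
        s = lo + int(np.argmax(rate[lo:peak + 1]))  # left inflection (steepest rise)
        e = peak + int(np.argmin(rate[peak:hi]))    # right inflection (steepest fall)
        if e <= s or (e - s) / fs < min_dur:
            continue                                # too short to be a discharge
        sides = np.r_[env[max(0, s - flank):s], env[e:e + flank]]
        if sides.size and env[s:e].mean() < 2.0 * sides.mean():
            continue                                # not twice the flanking level
        events.append((s / fs, e / fs))
    return events
```

Calling `detect_discharges(x, fs, threshold)` on a single-channel trace returns the start and end times, in seconds, of the events that pass both the duration and the flanking-area criteria.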


Figure 3. Automated epileptiform discharge detection applied on a 10 second iEEG segment.

(A) Sample in vivo iEEG recording with two discharge events shown. Red vertical lines indicate the boundary

separating discharge and inter-discharge regions. (B) Filtered 6-10Hz signal with the envelope (green) tracking the

power of the theta band which was used to distinguish between discharge and inter-discharge regions. (C) The

envelope rate of change demonstrating how the inflection points were used to identify the beginning and end of a

discharge.


2.2.2 Delta Power

Regions of low and high delta frequency power (0.5-4 Hz) as defined by [125] were determined.

The first step involved removing the 0.25 second period preceding and following each artifact

event, where artifacts were characterized by voltages greater than 0.5 Volts. Then, a FIR band

pass filter with an order of 1000 was applied to isolate the 0.5-4 Hz delta frequency band. The

delta band signal was squared and averaged over 30 second intervals to obtain the delta power. To

discern the daily patterning of the delta signals, we normalized the delta power signals to a mean

of zero and variance of one. A Gaussian-based kernel similar to that defined for the automated

discharge detection, but with a 50-point aperture, was applied on the filtered signal, generating an

envelope that represents the delta power. In order to discretize the signal, we used a zero threshold

to discern the difference between the high and low delta frequency power states.
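A rough Python sketch of the delta-state computation is given below. It is illustrative only: the function name is hypothetical, the artifact-removal step is omitted for brevity, and applying the 50-point Gaussian kernel to the normalized 30 second power series (rather than to the raw filtered trace) is one reading of the description, stated here as an assumption.

```python
import numpy as np
from scipy.signal import firwin, filtfilt


def delta_power_states(x, fs, win_s=30.0, numtaps=1001, aperture=50):
    """Return a boolean array, one entry per window: True = high delta power."""
    # Order-1000 FIR band-pass (1001 taps) isolating the 0.5-4 Hz delta band.
    taps = firwin(numtaps, [0.5, 4.0], pass_zero=False, fs=fs)
    delta = filtfilt(taps, [1.0], x)
    # Delta power: squared signal averaged over consecutive win_s-second windows.
    n = int(win_s * fs)
    n_win = len(delta) // n
    power = (delta[: n_win * n] ** 2).reshape(n_win, n).mean(axis=1)
    # Normalize to zero mean and unit variance.
    z = (power - power.mean()) / power.std()
    # Smooth with a Gaussian kernel (width aperture/6 is an assumed choice).
    k = np.arange(aperture) - (aperture - 1) / 2.0
    g = np.exp(-0.5 * (k / (aperture / 6.0)) ** 2)
    env = np.convolve(z, g / g.sum(), mode="same")
    # Zero threshold separates the high and low delta power states.
    return env > 0.0
```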

2.2.3 Time-Frequency Analysis

Time-frequency power distributions were obtained by applying a continuous wavelet transform

(CWT) on iEEG time-series recordings (Figure 4), where t̂ is a time interval centered on an

epileptiform discharge. The CWT measures the correlation between a signal 𝑥(t̂), and a wavelet

basis, 𝜓, for different scales, s, and time shifts 𝜏 [126] and is defined as,

$$ W(s,\tau) = \int_{\hat{t}} x(t)\,\psi^{*}_{s,\tau}(t)\,dt \tag{1.1} $$

where,

$$ \psi^{*}_{s,\tau}(t) = \frac{1}{\sqrt{s}}\,\psi_{o}\!\left(\frac{t-\tau}{s}\right) \tag{1.2} $$

is the basis function with * denoting the complex conjugate. The basis function used here is the

complex Morlet wavelet defined according to the following form,

$$ \psi_{o}(t) = \frac{1}{\sqrt{2\pi}}\exp\!\left(i\omega_{c}t - \frac{t^{2}}{2}\right) \tag{1.3} $$


The scales were transformed to appropriate frequencies 𝑓, from the angular frequency 𝜔𝑐, using

the relation 𝜔𝑐 = 2𝜋𝑓𝑠 = 5.1 𝑟𝑎𝑑/𝑠. The CWT yielded a complex-valued coefficient

matrix,

$$ W(f,t) = w(f,t) + i\,\tilde{w}(f,t) \tag{2.1} $$

for which the magnitude was obtained as a measure of power over time and frequency. The

frequencies ranged from 1 to 600 Hz by 1 Hz step size, and the times correspond to the duration

of the recording 𝑥(t̂).

To visualize the high and low frequencies on the same scale, the time-frequency vectors were

further z-score normalized according to the following,

$$ W_{norm}(f,t) = \frac{|W(f,t)| - \mu(f)\big|_{t_1}^{t_2}}{\sigma(f)\big|_{t_1}^{t_2}} \tag{2.2} $$

where 𝜇 and 𝜎 represent the mean and standard deviation of the wavelet coefficient

magnitudes at each frequency, taken from a two second baseline segment t̂

obtained prior to seizure onset (i.e. t̂ ∈ [𝑡1, 𝑡2]), see Figure 4b,c.
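The transform and its baseline normalization can be illustrated with a short Python sketch. It assumes a direct discretization of the complex Morlet wavelet with 𝜔𝑐 = 5.1, a kernel support of ±4 scales, and SciPy's `fftconvolve`; the function names and the baseline argument are hypothetical and not taken from the original implementation.

```python
import numpy as np
from scipy.signal import fftconvolve


def morlet_cwt(x, fs, freqs, omega_c=5.1):
    """Complex Morlet CWT; returns an array of shape (len(freqs), len(x))."""
    x = np.asarray(x, dtype=float)
    W = np.empty((len(freqs), x.size), dtype=complex)
    for i, f in enumerate(freqs):
        s = omega_c / (2.0 * np.pi * f)            # scale in seconds for frequency f
        half = int(np.ceil(4.0 * s * fs))          # assumed support of +/- 4 scales
        u = np.arange(-half, half + 1) / (fs * s)  # (t - tau) / s
        psi = np.exp(1j * omega_c * u - u ** 2 / 2.0) / np.sqrt(2.0 * np.pi * s)
        # Correlating x with conj(psi) equals convolving with its time reverse.
        W[i] = fftconvolve(x, np.conj(psi)[::-1], mode="same") / fs
    return W


def zscore_normalize(W, baseline):
    """Z-score |W| per frequency using the columns selected by `baseline`."""
    mag = np.abs(W)
    mu = mag[:, baseline].mean(axis=1, keepdims=True)
    sd = mag[:, baseline].std(axis=1, keepdims=True)
    return (mag - mu) / sd
```

For example, `zscore_normalize(morlet_cwt(x, fs, np.arange(1, 601)), slice(0, int(2 * fs)))` would use the first two seconds of the segment as the baseline, provided the sampling rate is high enough to represent 600 Hz.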

2.2.4 Comodulogram

Cross-frequency coupling for a time interval t̂ is proposed as a composite complex-valued signal

𝑆𝐶𝐹𝐶(t̂) consisting of the amplitude time-series of a higher frequency, 𝐴(t̂, 𝑓𝐻), and the phase

time-series of a lower frequency, 𝜑(t̂, 𝑓𝐿), as shown (see Figure 5),


Figure 4. Normalized time-frequency analysis of discharges observed during epileptiform events.

(A) iEEG recording of discharge events shows HFO activity during the peak of the discharge. (B) Standard time-

frequency analysis of discharges using CWTs does not reveal any HFOs due to 1/𝑓 power scaling. (C) Z-score

normalization of CWT coefficients using a baseline region prior to seizure onset corrects for the power scaling and

highlights the presence of HFOs.

$$ S_{CFC}(\hat{t}, f_H, f_L) = A(\hat{t}, f_H)\,e^{j\varphi(\hat{t}, f_L)} \tag{3.1} $$

The time-series of the amplitude envelope 𝐴(t̂, 𝑓𝐻) and instantaneous phase 𝜑(t̂, 𝑓𝐿) were

determined from the respective complex wavelet coefficients,


Figure 5. Step-by-step computation of the ICFC measure for a particular HFO-LFO frequency pair.

(A) Obtain iEEG time segment centered on an epileptiform discharge. (B) CWT-based time-frequency computation

generates coefficients across frequencies 1 to 600 Hz with a 1 Hz step size. (C) Extraction of an HFO and LFO signal

from the CWT coefficients. (D) Computation of the envelope from the HFO signal and phase for the LFO. (E) The

envelope amplitude values are binned in 20 degree phase intervals to obtain a histogram. Amplitude-phase histograms

that deviate from the uniform distribution lead to increased ICFC measurements.


$$ A(\hat{t}, f_H) = \left| w(\hat{t}, f_H) + j\,\tilde{w}(\hat{t}, f_H) \right| \tag{3.2} $$

$$ \varphi(\hat{t}, f_L) = \arctan\frac{\tilde{w}(\hat{t}, f_L)}{w(\hat{t}, f_L)} \tag{3.3} $$

The coupling of the amplitude of a higher frequency signal, 𝐴(t̂, 𝑓𝐻), to the phase of a lower

frequency signal, 𝜑(t̂, 𝑓𝐿), was assessed over a range of frequency pairs using the algorithm

proposed by Tort et al. [127]. The 𝜑(t̂, 𝑓𝐿) signal was segmented into 20 degree intervals resulting

in 𝑁 = 18 bins. Within each bin the amplitude envelopes were averaged, 〈𝐴(t̂, 𝑓𝐻)〉. The mean

amplitude in each phase bin was normalized by the sum of the mean amplitudes over all phase bins, according to,

$$ p_j(\hat{t}, f_H, f_L) = \frac{\langle A(\hat{t}, f_H)\rangle_j}{\sum_{k=1}^{N}\langle A(\hat{t}, f_H)\rangle_k} \tag{4.1} $$

producing a probability density value 𝑝𝑗, where j indicates the phase bin number which is

associated with 𝑓𝐿. Then an entropy measure defined by:

$$ H(\hat{t}, f_H, f_L) = -\sum_{j=1}^{N} p_j(\hat{t}, f_H, f_L)\,\log\!\big(p_j(\hat{t}, f_H, f_L)\big) \tag{4.2} $$

was determined and normalized to obtain the index of cross-frequency coupling,

$$ I_{CFC}(\hat{t}, f_H, f_L) = \frac{H_{max} - H(\hat{t}, f_H, f_L)}{H_{max}} \tag{4.3} $$


where 𝐻𝑚𝑎𝑥 is the maximum possible entropy value, which for a uniform distribution has a value

𝐻𝑚𝑎𝑥 = 𝑙𝑜𝑔 𝑁. The 𝐼𝐶𝐹𝐶 measure described by equation 4.3 yields the comodulogram when

examined over a range of frequencies. The computation of the ICFC is summarized in Figure 5.
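Equations 4.1 to 4.3 translate almost directly into code. The sketch below assumes that the amplitude envelope and phase (in radians, within [−π, π]) have already been extracted for one HFO-LFO pair; the function name and the handling of empty bins are choices made for this example.

```python
import numpy as np


def icfc(amplitude, phase, n_bins=18):
    """Index of cross-frequency coupling for one amplitude/phase pair."""
    amplitude = np.asarray(amplitude, dtype=float)
    phase = np.asarray(phase, dtype=float)
    # 20 degree phase bins when n_bins = 18.
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    # Mean amplitude within each phase bin (empty bins contribute zero).
    sums = np.bincount(idx, weights=amplitude, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    mean_amp = sums / np.maximum(counts, 1)
    # Eq. 4.1: normalize the mean amplitudes to a probability-like distribution.
    p = mean_amp / mean_amp.sum()
    # Eq. 4.2: entropy, skipping zero entries since p*log(p) -> 0 as p -> 0.
    nz = p > 0
    h = -np.sum(p[nz] * np.log(p[nz]))
    # Eq. 4.3: normalized distance from the uniform distribution.
    h_max = np.log(n_bins)
    return (h_max - h) / h_max
```

A flat amplitude-phase histogram gives a value near zero, while an amplitude that depends on phase gives a positive value. Evaluating the function over a grid of (𝑓𝐻, 𝑓𝐿) pairs produces the comodulogram.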

2.2.5 Empirical Mode Decomposition

The Ensemble Empirical Mode Decomposition (EEMD) was applied to extract rhythms which were

used with the time-based features, in addition to the computation of ICFC surrogates for the selected

delta, theta LFO and fast ripple HFO rhythms [128]. EEMD is applied to decompose signals into

rhythms and it does not require a priori knowledge of the frequency ranges of the rhythms, as

would have been needed for bandpass filtering. Furthermore, EEMD is better suited for extracting

rhythms with time-varying frequencies that span large frequency bands.

The EEMD separates a signal into multiple rhythms referred to as intrinsic mode functions (IMFs).

The decomposition as seen in Figure 6 is adaptive and dependent on local time characteristics of

the data. The general method works by decomposing a time-series signal into nearly orthogonal

components (IMFs) using a process called sifting. This method can be summarized in two steps:

i. The upper Xmax(t), and lower Xmin(t), envelopes of the signal are created by fitting

a smooth spline through the maxima and minima, respectively, of a given signal X(t).

ii. The mean of the two envelopes is then subtracted from the data to get a difference signal,

$$ X_1(t) = X(t) - \frac{X_{max}(t) + X_{min}(t)}{2} \tag{5} $$

This process is repeated by setting the new X(t) = X1(t) until a stopping condition is met.

Generally, the stopping condition is met after a certain number of iterations are reached.

The extracted IMF is subtracted from the original signal resulting in a residual signal R(t). Then a

new IMF is found by setting X(t) = R(t) and reapplying the sifting process. The search for new IMFs


is complete once the amplitude of the residual signal is smaller than some predefined value. At

this point the resulting set of IMFs can be summed together to reconstruct the original signal.

The IMFs produced by EMD are not truly orthogonal, and neighbouring IMFs often have overlapping components, an

artefact known as mixing. In an attempt to compensate for the mixing that can occur, an extension

to the EMD known as ensemble empirical mode decomposition (EEMD) has been made [128].

EEMD works by iteratively computing the EMD on a signal with added Gaussian noise. After

many iterations the noise cancels out and what is left is usually a well-defined signal with less

mixing.
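The sifting procedure and its ensemble extension can be sketched as follows. This is a simplified illustration, not the implementation used here: the stopping rule is a fixed number of sifting iterations, the signal endpoints are added as spline knots to keep the envelopes bounded, and the added noise is scaled relative to the standard deviation of the signal. All of these, together with the function names, are assumptions made for the example.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def sift_once(x):
    """One sifting step (Eq. 5); returns None if there are too few extrema."""
    imax = argrelextrema(x, np.greater)[0]
    imin = argrelextrema(x, np.less)[0]
    if len(imax) < 2 or len(imin) < 2:
        return None
    t = np.arange(len(x))
    imax = np.r_[0, imax, len(x) - 1]   # endpoints as knots (assumed edge handling)
    imin = np.r_[0, imin, len(x) - 1]
    upper = CubicSpline(imax, x[imax])(t)
    lower = CubicSpline(imin, x[imin])(t)
    return x - (upper + lower) / 2.0


def emd(x, max_imfs=9, n_sift=10):
    """Plain EMD; returns (imfs, residual) with imfs.sum(0) + residual == x."""
    residual = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        h = sift_once(residual)
        if h is None:                   # residual has too few extrema: stop
            break
        for _ in range(n_sift - 1):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        imfs.append(h)
        residual = residual - h
    return np.array(imfs), residual


def eemd(x, noise_var=0.2, n_ensembles=100, max_imfs=9, seed=0):
    """Average the IMFs of many noise-perturbed copies of x."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    acc = np.zeros((max_imfs, x.size))
    for _ in range(n_ensembles):
        # Noise scaled by the signal's standard deviation (an assumption).
        noise = np.sqrt(noise_var) * np.std(x) * rng.standard_normal(x.size)
        imfs, _ = emd(x + noise, max_imfs=max_imfs)
        if len(imfs):
            acc[: len(imfs)] += imfs
    return acc / n_ensembles
```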

Figure 6. Decomposition of time-series recordings using EEMD.

(A) Sample time-series signal composed from two signals of different frequencies. (B) The mean signal enclosed by

the upper and lower envelopes of the original signal is obtained. (C) The difference of the original and mean signals

is used to obtain IMF. (D) The remaining signal, known as the residual, is obtained by subtracting IMF1

from the original signal. This process continues until the residual signal equals zero or is sufficiently small as defined by a stopping

threshold.

In our approach the noise variance σ² was set to 0.2 and the number of iterations was set to 100,

which resulted in 9 IMFs. The IMFs spanned a large range of frequencies, with IMF1 representing

the highest and IMF9 the lowest frequency rhythm. The theta rhythm LFOs (6-10 Hz

frequency range) were obtained from IMF8, the delta LFOs (2-5 Hz frequency range) were

obtained by combining IMF8 and IMF9, and the fast ripple HFOs (400-600 Hz frequency range)

were obtained from IMF2.

2.3 Machine Learning Algorithms

2.3.1 Support Vector Machines

Support Vector Machines (SVMs) have been used extensively on classification problems on

scientific data and have shown superior performance over traditional statistical and neural

classifiers. SVMs have the ability to minimize both structural and empirical risk leading to better

generalization for new data classification even with a limited training dataset. Conceptually, the

SVM algorithm determines a nonlinear boundary in the feature vector space by computing a

maximum margin, linear boundary in the higher-dimensional space as shown in Figure 7. A

transformation to the higher-dimensional space is achieved using kernels [129]. SVMs work by

applying an n-dimensional transformation to construct a hyperplane that maximizes the separation

margin between input data classes [129]. SVM binary classification requires the computation of a

maximum margin separation specified by the following quadratic programming (QP) problem:

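$$ f(\boldsymbol{x}, \boldsymbol{\alpha}, b) = \{\pm 1\} = \mathrm{sgn}\!\left(\sum_{i=1}^{l} \alpha_i y_i k(\boldsymbol{x}_i, \boldsymbol{x}) + b\right) \tag{6.1} $$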


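$$ \max\; W(\boldsymbol{\alpha}) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j k(\boldsymbol{x}_i, \boldsymbol{x}_j) \tag{6.2} $$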

subject to,

0 ≤ 𝛼𝑖 ≤ 𝐶 (6.3)

and,

$$ \sum_{i=1}^{l} y_i \alpha_i = 0 \tag{6.4} $$

The number of training samples is denoted by l, α is a vector of l variables, where each component

𝛼𝑖 corresponds to a training point (𝑥𝑖, 𝑦𝑖), and C is the soft margin parameter which dictates the

influence of outliers in the training data. The kernel 𝑘(𝒙𝑖, 𝒙) typically corresponds to a nonlinear

transformation of the input space to a higher dimensional feature space. For the purposes of this

work the Gaussian kernel 𝑘(𝒙𝑖, 𝒙) = 𝑒𝑥𝑝(−𝛾‖𝒙 − 𝒙𝑖‖²) was used. The QP problem determines

a vector α, where each element specifies the weight of a training data point. A select number of support

vectors (SVs), which are the data points closest to the boundary, represent the margin by

having α values greater than zero. Since only a subset of the data, the SVs, is required for representing

the maximum separation boundary, SVMs are well suited for small data sets with large

feature vectors.
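To make the decision rule of equation 6.1 concrete, the sketch below evaluates it with the Gaussian kernel for a given set of support vectors. The weights, labels and bias are assumed to come from an already solved QP problem, and the function names are hypothetical.

```python
import numpy as np


def rbf_kernel(xi, x, gamma):
    """Gaussian kernel exp(-gamma * ||x - xi||^2), vectorized over rows of xi."""
    return np.exp(-gamma * np.sum((x - xi) ** 2, axis=-1))


def svm_decision(x, support_vectors, alphas, labels, b, gamma):
    """Eq. 6.1: sign of the kernel-weighted sum over the support vectors."""
    k = rbf_kernel(support_vectors, x, gamma)   # one kernel value per SV
    # np.sign returns 0 for a point lying exactly on the boundary.
    return int(np.sign(np.sum(alphas * labels * k) + b))
```

With two support vectors at 0 and 2 carrying labels +1 and −1 and equal weights, a query at 0 is assigned to class +1 and a query at 2 to class −1, since the nearer support vector dominates the sum.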

SVMs with Radial Basis Function (RBF) kernels were implemented using the LibSVM library

[130], and were trained to predict treatment outcomes for several ACDs. Training consisted of 5-

fold cross-validation to select the best regularization parameter C and the RBF gamma parameter

γ. The C parameter provides a tradeoff between misclassification of training examples and the

simplicity of the decision surface. A low C makes a smooth decision surface, while a high C

aims to classify all training examples correctly by giving the model freedom to select

more SVs. The gamma parameter defines the reach of the influence of a single training example, with

low values meaning a far reach and high values a close reach.

An exponential grid search over the parameters C (2^-3, 2^-4, …, 2^10) and γ (2^-18, 2^-17, …,

2^-3) was then performed. The parameter pair yielding maximum performance on the training set

was selected for testing.
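A comparable parameter search can be sketched with scikit-learn, whose `SVC` class is built on LIBSVM. This is an illustration rather than the original LibSVM workflow: the function name is hypothetical, and reading the grids as unit exponent steps from 2^-3 to 2^10 for C and from 2^-18 to 2^-3 for γ is an assumption.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC


def tune_rbf_svm(X, y, n_folds=5):
    """Grid-search C and gamma for an RBF-kernel SVM with k-fold cross-validation."""
    param_grid = {
        "C": 2.0 ** np.arange(-3, 11),       # 2^-3 ... 2^10 (assumed unit steps)
        "gamma": 2.0 ** np.arange(-18, -2),  # 2^-18 ... 2^-3
    }
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=n_folds)
    search.fit(X, y)
    return search
```

After fitting, `search.best_params_` holds the selected (C, γ) pair and `search.best_estimator_` is the model refit on the full training set, ready to be applied to held-out test data.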

Figure 7. Description of SVM algorithm.

A) SVMs determine a separating hyperplane between two classes by applying optimization techniques to find the

margin with maximum separation. B) To facilitate the finding of the maximum separation, the initial input space

data is transformed to a higher dimensional feature space where a linear separation of classes can be achieved using

a hyperplane. This transformation is achieved with kernelization, and this higher dimensional hyperplane can be

represented as a nonlinear separating margin in the lower dimensional input space.


2.3.2 Random Forest

The Random Forest (RF) classifier, proposed initially by Breiman [131], was used as one of the

predictive models for ACD treatment outcome. The RF is an ensemble learning method for

classification and other tasks. RFs work by constructing a multitude of decision trees, where each

tree is independently trained to make classification decisions. The core of the random forest classifier

is the binary decision tree, a data structure that stores elements hierarchically in nodes (Figure 8). Each

decision tree is grown on a different bootstrapped sample (i.e. instances randomly drawn

with replacement from the original dataset) using a randomly selected subset of all available

predictors. During training a subset of the trees are randomly selected and trained on the dataset.

On testing, the trees perform majority voting to make the final class selection,

$$ P(c|\mathbf{v}) = \sum_{t=1}^{T} P_t(c|\mathbf{v}) \tag{7} $$

where 𝑃(𝑐|𝐯) is the conditional probability of class label 𝑐 given input feature set 𝐯, and 𝑃𝑡 is

the conditional probability for each randomly drawn tree.
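The aggregation in equation 7 and the bootstrap draw used to grow each tree can be illustrated with a few lines of NumPy. The function names are hypothetical and the per-tree probabilities are assumed to be supplied by already trained trees.

```python
import numpy as np


def forest_posterior(tree_probs):
    """Eq. 7: sum the per-tree class probabilities P_t(c|v) over all T trees.

    tree_probs has shape (T, n_classes), one row per tree.
    """
    return np.asarray(tree_probs, dtype=float).sum(axis=0)


def forest_predict(tree_probs):
    """Class label with the largest aggregated score."""
    return int(np.argmax(forest_posterior(tree_probs)))


def bootstrap_indices(n_samples, rng):
    """Row indices drawn with replacement, used to grow one tree."""
    return rng.integers(0, n_samples, size=n_samples)
```

Dividing the sum by T would turn it into an average probability without changing which class is selected. Summing probabilities can also differ from counting one vote per tree: for per-tree probabilities (0.9, 0.1), (0.4, 0.6) and (0.3, 0.7), the summed scores favor the first class even though two of the three trees individually prefer the second.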

The advantage of the RF approach over other machine learning algorithms is that the random

s
