Machine Learning for Prediction of Anticonvulsive Drug Treatment Outcomes in Mecp2-Deficient Mice by Sinisa Colic A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy The Edward S. Rogers Sr. Department of Electrical and Computer Engineering University of Toronto © Copyright by Sinisa Colic 2017


  • I

    Machine Learning for Prediction of Anticonvulsive Drug Treatment Outcomes in Mecp2-Deficient Mice

    by

    Sinisa Colic

    A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy

    The Edward S. Rogers Sr. Department of

    Electrical and Computer Engineering University of Toronto

    © Copyright by Sinisa Colic 2017

  • ii

    Machine Learning for Prediction of Anticonvulsive Drug Treatment Outcomes in Mecp2-Deficient Mice

    Sinisa Colic

    Doctor of Philosophy

    The Edward S. Rogers Sr. Department of

    Electrical and Computer Engineering University of Toronto

    2017

    Abstract

    Anticonvulsive drug (ACD) treatments produce inconsistent outcomes, often requiring patients

    to go through several drug trials before a successful treatment can be found. In this thesis we apply

    a novel approach, using machine learning techniques to predict epilepsy treatment outcomes of

    commonly used ACDs. Machine learning algorithms were trained and evaluated using features

    obtained from intracranial electroencephalogram (iEEG) recordings of the epileptiform discharges

    observed in the Mecp2-deficient mouse model of Rett Syndrome. Our work on Mecp2-deficient

    mice focuses on low frequency oscillations (LFO), high frequency oscillations (HFO) and their

    interactions through cross-frequency coupling (CFC) to reveal iEEG based biomarkers that track

    epileptic seizure pathology. Our findings revealed: variability across discharge events using iEEG

    recordings, progression of longer duration discharges over five developmental time points, and the

    increased cross-frequency coupling index ICFC of the delta (2-5 Hz) rhythm with the fast ripple

    (400-600 Hz) rhythm in discharge events. These results suggest a link between long duration

    discharges with elevated ICFC to the epileptic seizure pathology. Using the ICFC to label post-

    treatment outcomes we trained Support Vector Machines (SVM) and Random Forest (RF) machine

  • iii

    learning classifiers on time-based, normalized power and CFC comodulogram features to predict

    the efficacy of ACD treatments. The results indicate that the comodulogram

    features yielded better predictions, which were further improved when combined with time-based

    features. Hence, machine learning techniques were able to rank ACDs by estimating likelihood

    scores for successful treatment outcome. Identifying the most appropriate ACD treatment a priori

    would reduce the burdens of drug trials and provide patient specific treatment options that could

    lead to substantial improvements in patient quality of life.
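
    The phase-amplitude coupling idea summarized above, in which the amplitude of a fast rhythm varies with the phase of a slow rhythm, can be illustrated with a minimal sketch. It computes a normalized mean vector length, a generic coupling measure that is assumed here for illustration only and is not the ICFC defined later in this thesis. The function names, the 3 Hz delta frequency, the 2000 Hz sampling rate and the directly simulated fast ripple envelope are all hypothetical choices; a real analysis would first extract phase and amplitude from filtered iEEG.

```python
import cmath
import math
import random


def mean_vector_length(phases, amplitudes):
    """Normalized mean vector length, between 0 (no coupling) and 1.

    phases: low-frequency phase in radians, one value per sample.
    amplitudes: non-negative high-frequency envelope, one value per sample.
    Generic illustration only; not the thesis's ICFC.
    """
    if len(phases) != len(amplitudes) or not phases:
        raise ValueError("phases and amplitudes must be non-empty and equal in length")
    total_amplitude = sum(amplitudes)
    if total_amplitude <= 0:
        return 0.0
    # Each sample contributes a vector of length `amplitude` at angle `phase`.
    vector_sum = sum(a * cmath.exp(1j * p) for p, a in zip(phases, amplitudes))
    return abs(vector_sum) / total_amplitude


def synthetic_discharge(coupling, n_samples=4000, fs=2000.0, f_delta=3.0, seed=0):
    """Simulate a delta phase and a fast ripple envelope (hypothetical values).

    coupling in [0, 1] sets how strongly the envelope follows the delta phase.
    """
    rng = random.Random(seed)
    phases, amplitudes = [], []
    for k in range(n_samples):
        phase = (2.0 * math.pi * f_delta * k / fs) % (2.0 * math.pi)
        # The envelope peaks at delta phase 0 when coupling > 0; small noise is added.
        envelope = 1.0 + coupling * math.cos(phase) + 0.05 * rng.random()
        phases.append(phase)
        amplitudes.append(envelope)
    return phases, amplitudes


weak = mean_vector_length(*synthetic_discharge(coupling=0.0))
strong = mean_vector_length(*synthetic_discharge(coupling=0.8))
print(f"uncoupled: {weak:.4f}, coupled: {strong:.4f}")
```

    In this sketch the coupled event yields a much larger value than the uncoupled one, which is the kind of contrast a coupling index is meant to capture between discharge types.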

  • iv

    Acknowledgements

    To my supervisor Dr. Berj Bardakjian for your constant encouragement, compromise and abundant

    optimism that always manages to light the way through any obstacles.

    To my supervisory committee, Dr. Quaid Morris, Dr. Kostas Plataniotis, and Dr. Liang Zhang, for

    your invaluable comments, feedback and unique perspectives. To my research collaborators, Dr.

    James Eubanks, Rob Wither and Min Lang for all your experimental data contributions,

    knowledge, and helpful suggestions.

    To my fellow lab group members, past and present, for their helpful discussions, advice and all the

    fun lab adventures over the years. Thank you: Mark Aquilino, Vanessa Breton, Dr. Marija Cotic,

    Joshua Dian, Fihras Farah, Vasily Grigorovsky, Dr. Mirna Guirgis, Trevor Hilton, Daniel Jacobs,

    Dr. Eunji Kang, Angela Lee, Dr. Ryan McGuinn, Dr. Demitre Serletis, Dr. David Stanely, Sam

    Talasila, Uilki Tufa and Dr. Osbert Zalay. Special thanks to: Joshua Dian for setting up and

    maintaining the Big Bang Theory computer cluster, Dr. Ryan McGuinn for providing the ICFC

    surrogate analysis implementation, and Dr. Osbert Zalay for the multiple MATLAB toolboxes.

    To my friends, for their encouragement, endless jokes and intermittent social distractions that have

    provided ample breaks from research and opportunities to refresh.

    To my family for their unconditional love and support. My parents for making tremendous

    sacrifices without which I would not be here today. Most importantly to my beautiful wife for all

    the times spent proofreading my writing and for patiently standing by my side on this roller coaster

    ride called a PhD.

  • v

    Table of Contents

    List of Abbreviations ............................................................................................................. viii

    List of Figures ............................................................................................................................ x

    List of Tables ......................................................................................................................... xiii

    Chapter 1 .................................................................................................................................... 1

    Introduction ................................................................................................................................ 1

    1.1 Epilepsy .......................................................................................................................... 1

    1.1.1 Animal Models of Epilepsy ............................................................................... 3

    1.1.2 Mecp2-Deficient Mouse Model of Epilepsy ...................................................... 4

    1.1.3 Mechanisms of Mecp2 Deficiency .................................................................... 4

    1.2 Electrical Rhythms in the Brain ..................................................................................... 5

    1.2.1 Low Frequency Oscillations .............................................................................. 6

    1.2.2 High Frequency Oscillations .............................................................................. 8

    1.2.3 LFO-HFO Cross-Frequency Coupling ............................................................ 10

    1.3 Prediction of Treatment Outcome ................................................................................ 12

    1.4 Objectives and Hypothesis ........................................................................................... 14

    1.4.1 Objectives ........................................................................................................ 14

    1.4.2 Hypothesis ........................................................................................................ 15

    1.5 Outline of Chapters ...................................................................................................... 16

    Chapter 2 .................................................................................................................................. 17

    Methodology ............................................................................................................................ 17

    2.1 Experimental Setup ...................................................................................................... 19

    2.1.1 Animal Subjects ............................................................................................... 19

    2.1.2 Implantation Surgery ....................................................................................... 20

    2.1.3 iEEG Recording and Analysis ......................................................................... 20

  • vi

    2.1.4 Established and Experimental Anticonvulsive Drug Treatments .................... 21

    2.2 Signal Processing ......................................................................................................... 23

    2.2.1 Automated Discharge Detection ...................................................................... 23

    2.2.2 Delta Power ...................................................................................................... 25

    2.2.3 Time-Frequency Analysis ................................................................................ 25

    2.2.4 Comodulogram ................................................................................................ 26

    2.2.5 Empirical Mode Decomposition ...................................................................... 30

    2.3 Machine Learning Algorithms ..................................................................................... 32

    2.3.1 Support Vector Machines ................................................................................ 32

    2.3.2 Random Forest ................................................................................................. 35

    2.3.3 Feature Extraction ............................................................................................ 37

    2.3.4 Feature Selection and Reduction ..................................................................... 38

    2.3.5 Evaluation ........................................................................................................ 40

    2.4 Statistical Analyses ...................................................................................................... 42

    2.4.1 Gamma Fit ....................................................................................................... 42

    2.4.2 Surrogates ........................................................................................................ 43

    2.4.3 Standard error and t-tests ................................................................................. 44

    Chapter 3 .................................................................................................................................. 45

    Long Duration Discharges as Biomarkers of Epileptiform Pathology .................................... 45

    3.1 Analysis of Discharge Durations ................................................................................. 45

    3.2 Gamma Distribution ..................................................................................................... 48

    3.3 Clustering ..................................................................................................................... 51

    3.4 Delta and Theta LFOs .................................................................................................. 52

    3.5 Long Duration Discharges ........................................................................................... 55

    3.6 Discussion .................................................................................................................... 56

    Chapter 4 .................................................................................................................................. 61

  • vii

    LFO-HFO Cross-Frequency Couplings as Biomarkers of Epileptiform Pathology ................ 61

    4.1 HFO Activity ............................................................................................................... 62

    4.2 Cross-Frequency Coupling .......................................................................................... 66

    4.3 Mecp2 Gene Reactivation ............................................................................................ 71

    4.4 Treatment Outcomes of ACDs ..................................................................................... 74

    4.5 Discussion .................................................................................................................... 76

    Chapter 5 .................................................................................................................................. 79

    A Machine Learning Approach to Treatment Outcomes ......................................................... 79

    5.1 Dataset .......................................................................................................................... 79

    5.2 Feature Selection .......................................................................................................... 81

    5.3 Training ........................................................................................................................ 84

    5.4 Treatment Prediction .................................................................................................... 84

    5.5 Discussion .................................................................................................................... 88

    Chapter 6 .................................................................................................................................. 91

    Conclusions and Future Work ................................................................................................. 91

    6.1 Significant Contributions ............................................................................................. 92

    6.2 Future Directions ......................................................................................................... 93

    References ................................................................................................................................ 99

  • viii

    List of Abbreviations

    ACD Anticonvulsive Drugs

    CFC Cross-Frequency Coupling

    CRG Cognitive Rhythm Generator

    CWT Continuous Wavelet Transform

    DBS Deep Brain Stimulation

    EEG Electroencephalogram

    EEMD Ensemble Empirical Mode Decomposition

    EMD Empirical Mode Decomposition

    FDA Food and Drug Administration

    FIR Finite Impulse Response

    FN False Negative

    FP False Positive

    FPR False Positive Rate

    GAERS Genetic Absence Epilepsy Rat from Strasbourg

    HFO High Frequency Oscillations

    HMM Hidden Markov Model

    ICFC Index of Cross-Frequency Coupling

    iEEG Intracranial Electroencephalogram

  • ix

    IMF Intrinsic Mode Function

    LFO Low Frequency Oscillations

    LFP Local Field Potential

    MECP2 Methyl-CpG-Binding Protein 2

    ML Machine Learning

    MRI Magnetic Resonance Imaging

    mRMR Minimum Redundancy Maximum Relevance

    PCA Principal Component Analysis

    QP Quadratic Programming

    RBF Radial Basis Function

    REM Rapid Eye Movement

    RF Random Forest

    ROC Receiver Operating Characteristic

    ROI Region of Interest

    SLE Seizure-Like Events

    SVM Support Vector Machines

    TN True Negative

    TP True Positive

    TPR True Positive Rate

    t-SNE t-Distributed Stochastic Neighbor Embedding

  • x

    List of Figures

    Figure 1. Overview of experimental design, signal processing and machine learning methods. 18

    Figure 2. Mecp2-deficient mouse model of Rett Syndrome. ....................................................... 22

    Figure 3. Automated epileptiform discharge detection applied on a 10 second iEEG segment. . 24

    Figure 4. Normalized time-frequency analysis of discharges observed during epileptiform

    events. ........................................................................................................................................... 27

    Figure 5. Step-by-step computation of the ICFC measure for a particular HFO-LFO frequency

    pair. ............................................................................................................................................... 28

    Figure 6. Decomposition of time-series recordings using EEMD. .............................................. 31

    Figure 7. Description of SVM algorithm. .................................................................................... 34

    Figure 8. Summarization of the Random Forest algorithm. ........................................................ 36

    Figure 9. Features used for training the treatment prediction algorithms. ................................... 39

    Figure 10. ROCs used for evaluation of treatment outcome predictions for different machine

    learning algorithms. ...................................................................................................................... 41

    Figure 11. Gamma distribution comparison for different alpha (α) values. ................................ 43

    Figure 12. Time-series iEEG recordings obtained from Mecp2-deficient mouse model showing

    changes of discharge occurrence and duration with age. .............................................................. 46

    Figure 13. Average discharge occurrence, duration and frequency across five developmental

    points in time. ................................................................................................................................ 47

    Figure 14. Histograms of discharge durations and inter-discharge durations with the

    corresponding gamma fit overlaid in red after 14-18 months of development. ............................ 49

  • xi

    Figure 15. Comparison of alpha values obtained from ictal and interictal events from Mecp2-

    deficient mouse model against values obtained from patients with absence epilepsy and other

    animal models. .............................................................................................................................. 50

    Figure 16. Distributions of discharge occurrences over 24 hour periods obtained from mice 14-

    18 months of age. .......................................................................................................................... 51

    Figure 17. Analysis of the presence of clustering using Ripley’s K statistic. ............................. 52

    Figure 18. Comparison of discharge counts per hour for the day/night cycles and high/low delta

    frequency power. ........................................................................................................................... 54

    Figure 19. Comparison of discharge and inter-discharge durations for high and low delta

    frequency power regions. .............................................................................................................. 55

    Figure 20. Tracking of long and short duration discharge counts across five developmental time

    points. ............................................................................................................................................ 57

    Figure 21. Z-score normalized time-frequency distribution of a discharge. ................................ 63

    Figure 22. Progression of time-frequency distribution across five developmental time points. . 64

    Figure 23. Quantization of fast ripple HFO presence associated with discharges. ...................... 65

    Figure 24. Comodulogram of LFO-HFO cross-frequency coupling in a Mecp2-deficient mouse.

    ....................................................................................................................................................... 68

    Figure 25. EEMD extraction of delta, theta and fast ripple rhythms for the purposes of

    computing cross-frequency coupling index. ................................................................................. 69

    Figure 26. Tracking of phase-amplitude, cross-frequency coupling of delta – HFO and theta –

    HFO in long and short duration discharges over five developmental time points. ....................... 70

    Figure 27. Characterization of discharges in Mecp2-deficient mice pre- and post- Mecp2 gene

    reactivation. ................................................................................................................................... 72

  • xii

    Figure 28. Phase-amplitude cross-frequency coupling of delta – HFO and theta – HFO in long

    and short duration discharges compared pre- and post- Mecp2 gene reactivation. ....................... 73

    Figure 29. Delta-HFO CFC in long duration discharges used as a biomarker to evaluate

    treatment outcome. ........................................................................................................................ 75

    Figure 30. Low dimensional feature projections using t-SNE. Each subject was identified by a

    unique colour. ............................................................................................................................... 82

    Figure 31. Low dimensional feature projections for the anticonvulsive drug THIP. .................. 83

    Figure 32. ROC evaluation of feature sets and machine learning methodologies. ...................... 85

    Figure 33. Predicted likelihood of favorable treatment outcome across four commonly used

    ACDs. ............................................................................................................................................ 87

    Figure 34. Combined LFO and HFO features identify channels in the ROI. .............................. 96

    Figure 35. Schematic of responsive neuromodulation protocol, combining the CRG model of

    Seizure-Like Events with RBF, periodic, random-repetitive and random modulation modalities.

    ....................................................................................................................................................... 98

  • xiii

    List of Tables

    Table 1. Examination of delta-HFO phase-amplitude coupling pre and post anticonvulsive drug

    administration. .............................................................................................................................. 80

    Table 2. Summary of feature reduction applied for THIP drug treatment prediction. ................ 88

  • 1

    Chapter 1

    Introduction

    1.1 Epilepsy

    Epilepsy is characterized by disruption in the operation of normal brain activity, often described

    as electrical storms in the brain. The word “epilepsy” is derived from Latin and Greek words for

    “seizure” or “to seize upon” [1]. As a neurological disease, epilepsy has no known point of origin and can be

    traced as far back as medical records exist. One of the earliest records of the disease is

    found in a Babylonian medical text from 1100 BC, which details treatments and likely

    outcomes and characterizes many features of the different seizure types [2]. Due to their limited

    biomedical understanding, the Babylonians attributed the seizures to possession by evil spirits and

    called for treatment through spiritual means. This sort of spiritual thinking continued through

    Greek and Roman times and well into the modern age. In the Dark Ages, for example,

    epilepsy was associated with demons, and it was believed that drilling a

    hole into the skull would expel the demon and so treat the disease. These notions of evil spirits and

    demons were commonplace well into the 17th century.

    The landscape started to change in the mid-1800s when anticonvulsive drugs (ACDs) were

    discovered [1]. As our knowledge of the brain improved and pharmacological treatments became

    available, scientific reason prevailed. The associations with evil spirits slowly disappeared and were

    replaced by scientifically founded postulations.

  • 1.1 Epilepsy 2

    The scientific successes of the 20th century revealed epilepsy to be a dynamic neurological disease

    [3], associated with hyperexcitability that disrupts normal brain activity.

    These disruptions lead to abnormal neuronal activity, known as seizures, which can

    be localized or generalized, with widely varying symptoms and manifestations. The underlying

    causes have been organized into three subgroups: genetic, structural-metabolic and unknown

    etiology [4].

    Epilepsy afflicts approximately 1.8% of the population [5]. The vast majority of patients can be

    treated with ACDs; however, variability in epilepsy etiology translates to variability in treatment

    outcome. Thus, not all patients respond favorably to anticonvulsive drugs. Approximately 20-40%

    of patients develop what is known as intractable epilepsy, where medication does not lead to

    seizure freedom [6, 7]. For these patients, the only treatment options are surgery or

    neuromodulation.

    The variability in ACD treatment outcome can best be seen in the treatment of epileptic seizures

    in Rett Syndrome. Rett Syndrome is a neurodevelopmental disorder and one of the

    most common causes of intellectual disability in females. It is an X-linked disorder that primarily

    affects girls at a rate of approximately 1:10,000 live female births. Rett Syndrome is caused by

    mutations in the gene encoding Methyl-CpG-Binding Protein 2 (MECP2) [8]. Unlike most forms

    of epilepsy, the epilepsy resulting from Rett Syndrome typically does not respond favorably to

    common ACDs and can require an exhaustive search to find an effective treatment option [9, 10].

    Our brief glimpse into the history of epilepsy reveals that when little was known about the disorder,

    the prevailing belief was that it was caused by demons and evil spirits. As new information

    became available and our scientific understanding grew, we learned that there is a physical basis

    for the disorder and possible treatments. Further understanding of epileptogenesis will only lead

    to improved treatment options and better treatment selection.

  • 1.1 Epilepsy 3

    1.1.1 Animal Models of Epilepsy

    The investigation of epileptogenesis and the evaluation of novel treatments are mainly achieved through

    the study of animal models of epilepsy. A large variety of animal models have been created,

    including pharmacological (e.g. pilocarpine, kainate), electrical (e.g. kindling), genetic (e.g. knock-out

    mice), and injurious (e.g. trauma, hypoxia, stroke) models, each designed to mimic the equally

    numerous types and causes of epilepsy in humans [11]. These models can be

    further categorized as in vitro or in vivo, each with its own advantages and disadvantages.

    The in vitro models, including brain slices, cell cultures and molecular assays, provide a reduced

    biological system and are ideally suited for obtaining precise cellular recordings and

    pharmacological responses of the cell. The low magnesium hippocampal slice model is well

    established in the literature owing to its reliable effects in vitro and its ability to generate spontaneous

    and recurrent seizure-like events (SLEs) that resemble SLEs observed in vivo [12]; however, like

    all slice models, it is limited to measuring local network interactions, which are not a true

    representation of the brain. Other models have been developed to get around this limitation;

    for example, the whole-hippocampal slice model records directly from the

    intact hippocampus and preserves more of the network [13, 14]. In addition to being confined to

    local networks, in vitro models are difficult to maintain for long periods of time and are therefore

    generally limited to short-term recordings.

    Unlike in vitro models, in vivo models are not limited to local network connections, can be

    maintained for long-term studies, and exhibit behavioural and electroencephalographic

    seizure-like events (discharges) that more closely mimic the clinical features of human epilepsy [11]. The

    kindling and kainic acid models are two commonly used in vivo models of epilepsy, which work

    by applying electrical and pharmacological stimulation, respectively, to induce SLEs [15, 16].

    The kindling technique may be employed on a wide

    variety of species, such as dogs, rabbits, cats and monkeys [16-19]. Although both models are effective at

    generating SLEs, they result in significant, irreversible brain damage [15, 16]. The Mecp2-deficient mouse

    model is an in vivo model of epilepsy that avoids the brain damage resulting from the kainic acid

    and kindling techniques. Furthermore, the Mecp2-deficient mouse model is obtained using genetic

  • 1.1 Epilepsy 4

    knock-out mice that can be treated, or rescued, using gene reactivation techniques, thus making it

    ideally suited for studying the effects of anticonvulsive drug (ACD) treatment outcomes.

    1.1.2 Mecp2-Deficient Mouse Model of Epilepsy

    The Mecp2-deficient mouse model is an in vivo model of Rett Syndrome. It recapitulates many of

    the behavioural and neurological deficits observed clinically in Rett Syndrome and shows

    spontaneous and recurrent discharges similar to those observed in other epilepsy models [20].

    These discharges exhibit absence-like characteristics and are severely attenuated with the

    administration of ethosuximide, an anti-absence drug [21]. It has been suggested that these

    rhythmic spike and wave discharges result from thalamo-cortical network dysfunction, as has been

    reported in other rodent models with absence-like discharges [22, 23]. Mecp2-deficient mouse

    models result from either a lack of Mecp2 or the expression of a relevant mutant form of Mecp2

    [24]. One of the earliest Mecp2-deficient mouse models is the Mecp2tm1.1Bird model, which results

    from the excision of exons 3 and 4 of the Mecp2 gene, resulting in deletion of the amino-terminal

    eight amino acids of the Mecp2 protein [25]. Since the earliest models, there have been several

    improvements to allow for selective activation of the Mecp2 protein at specific time points in the

    development of the mouse using Tamoxifen, which is an estrogen receptor antagonist [20]. By

    selectively reactivating Mecp2 function, it is possible to assess whether Mecp2

    deficiency during developmental stages has irreversible detrimental effects [24, 26]. Furthermore,

    this opens the possibility of studying pre- and post-gene-reactivation effects to determine

    biomarkers of epileptiform activity.

    1.1.3 Mechanisms of Mecp2 Deficiency

    A common phenotype in mouse models lacking Mecp2 is cortical network hyperexcitability [27,

    28]. Network hyperexcitability has been associated with cortical spike and wave discharges that

  • 1.2 Electrical Rhythms in the Brain 5

    have been identified in EEG recordings [21, 26, 29]. Hyperexcitability resulting from Mecp2

    deficiency appears to be specific to particular brain regions. Cortical network hyperexcitability has been

    observed in mice globally lacking Mecp2 but not in mice selectively lacking Mecp2

    from HOX-B1 hindbrain neurons [30]. Hyperexcitability has also been observed in mice lacking

    Mecp2 in forebrain inhibitory neurons but not in mice selectively lacking Mecp2

    in extracortical forebrain neurons [31]. Together these results suggest that specific GABAergic

    circuits may be responsible for network hyperexcitability in Mecp2-deficient mice [32].

    The network hyperexcitability of Mecp2-deficient mice is thought to result from

    enhanced GABAB receptor activity due to diminished cortical expression of the GABA transporter

    GAT-1 [32]. By enhancing GABAB receptor activity with GS-39783 (2.5 mg/kg), a

    positive allosteric modulator of GABAB receptors, Zhang et al. showed significantly elevated

    discharge incidence rates for Mecp2-deficient mice. At higher dosages, GS-39783 (5.0 mg/kg) was

    even able to induce discharge events in female wild-type mice. Administration of NO-711 (1.0

    mg/kg), a GAT-1 protein inhibitor, in female Mecp2-deficient mice, also resulted in a significant

    increase in discharge incidence rate. Furthermore, administering an increased dosage of NO-711

    (2.5 mg/kg) in female wild-type mice was also shown to induce discharge events [32]. Previous

    studies have shown that impairment of GAT-1 enzymatic activity is linked to the genesis of

    discharge events in cortical circuits [33]. Selective stimulation of extra-synaptic GABAA receptor

    activity using THIP, a selective agonist for GABAA receptor delta-subunit, resulted in a decrease

    of excitatory discharge patterns. Together these findings support the link between the reduction in

    GAT-1 and the network hyperexcitability observed in Mecp2-deficient mice.

    1.2 Electrical Rhythms in the Brain

    Characterization of epilepsy is a vital step in determining an appropriate treatment. The best way

    to achieve this characterization is to measure the electrical activity in an epileptic brain and

    compare it with the normal, or expected activity in a healthy brain. Through this comparison it is

    possible to obtain biomarkers of the pathology.

    The electrical signals measured in the brain, referred to as electroencephalograms (EEGs), provide

    a measure of the mean electrical activity between two points in the brain, one point being the point

    of interest and the other being a reference.

    EEG recordings can be obtained from outside the head using scalp EEG, or invasively from inside

    the head using implanted electrodes, referred to as intracranial EEG (iEEG). Non-invasive

    scalp EEG electrodes are placed on the head according to the 10-20 system, evenly distributed over the surface. Due to

    the skull, hair and distance from the brain, signals obtained this way are prone to artifacts.

    Intracranial EEG overcomes these limitations, though at the price of surgical implantation and the

    risks associated with it. Intracranial recordings can be achieved using a flat strip of electrodes

    placed on the surface of the cortex, or using electrodes inserted deeper inside the brain. While

    iEEG is limited to a smaller spatial region, these recordings provide a more detailed representation

    of the underlying electrical activity, and allow access to deeper brain structures whose activity may

    not be detected by scalp EEG [34].

    Since the first recordings conducted by Berger in 1924, there have been numerous discoveries

    facilitated by EEGs. One of the most notable discoveries is that of brain rhythms - electrical

    oscillations found in different frequency bands. These rhythms can be broadly categorized as low

    frequency oscillations (LFOs) and high frequency oscillations (HFOs). Their presence and

    interactions have been shown to play vital roles in normal functioning brains, revealing insights

    into epilepsy and other neurological pathologies [35].

    1.2.1 Low Frequency Oscillations

    EEG recordings from the brain are in the microvolt range, which leads to numerous technological

    challenges. The primary challenge is the elimination of artifacts. To avoid artifacts, the initial

    acquisitions were limited to LFOs, which are defined as oscillations below 30 Hz [36]. Over the years

    LFOs have been clinically categorized into four frequency ranges [35]:

    0.5-4 Hz – Delta band

    4-8 Hz – Theta band

    8-12 Hz – Alpha band

    12-30 Hz – Beta band
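
    As a minimal sketch, the four clinical LFO ranges listed above can be written as a lookup that maps a frequency to its band name. The function name and the choice to treat each range as half-open (so that 4, 8 and 12 Hz each fall in exactly one band) are assumptions made for illustration and are not specified in the text.

```python
# Band edges follow the clinical LFO ranges listed above. Treating each range
# as half-open [low, high) is an assumption made here so that the shared edges
# at 4, 8 and 12 Hz are assigned to a single band.
LFO_BANDS = [
    ("delta", 0.5, 4.0),
    ("theta", 4.0, 8.0),
    ("alpha", 8.0, 12.0),
    ("beta", 12.0, 30.0),
]


def classify_lfo_band(frequency_hz):
    """Return the LFO band name for a frequency, or None outside 0.5-30 Hz."""
    for name, low, high in LFO_BANDS:
        if low <= frequency_hz < high:
            return name
    return None


if __name__ == "__main__":
    for f in (2.0, 6.0, 10.0, 20.0, 45.0):
        print(f, classify_lfo_band(f))
```

    Frequencies at or above 30 Hz return None here because they fall outside the LFO definition used in this section.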

    Normal theta band activity is further classified into two types: hippocampal theta rhythms and

    cortical theta rhythms. Hippocampal theta rhythms originate in the hippocampus. Due to the

    strong link between the hippocampus and motor behavior, hippocampal theta rhythms are thought

    to be linked to sensorimotor processing and mechanisms of learning and memory [37]. Cortical

    theta rhythms are frequently observed in young children and can also show up during meditation,

    drowsiness and some of the early stages of sleep [38]. Theta rhythms can also show up in cases of

    pathology. For example, in epilepsy the theta rhythm is the predominant rhythm found during

    epileptic seizures [35]. This abnormal theta activity observed during epileptic seizures is usually

    characterized by large amplitude spikes.

    Normal delta rhythms have been found to identify the deep stages of non-REM sleep, particularly stages 3 and

    4 [38]. Delta is related to physical awareness and consciousness, and can be induced through deep

    meditation. Studies have found that cognitive tasks may elicit delta network activity. Delta rhythms

    have also been associated with motivation, and have been found to increase in power during

    periods of attention or pain.

    During wakefulness, delta activity is often overpowered by activity in higher frequency bands, but

    becomes dominant when these higher frequencies are pathological in nature. Sleep studies have

    revealed that epileptiform activity is influenced by the presence of the delta band, with some

    studies showing that the rate of occurrence and characteristics of seizures change during the

    presence of high delta rhythms [39]. In the absence of spiking activity, the presence of delta slow

    waves was more prevalent with uncontrolled seizures [40]. Slowing of interictal delta activity has

    been observed in patients with temporal lobe epilepsy [41] and has been linked to positive surgical

    outcomes suggesting that delta could play a role as a biomarker for surgical treatment [42].

    Theta and delta LFO frequency bands have been extensively studied in relation to epilepsy. LFOs

    by their nature can traverse further spatially and have a greater effect on the communication

    between various regions of the brain [43], potentially maintaining the seizure episode once it

    begins.

    1.2.2 High Frequency Oscillations

    With the arrival of modern digital recording in the 1980s, it became possible to record frequencies

    greater than 80 Hz, which are referred to as HFOs. HFOs have been found in both animal [44, 45]

    and human studies [45-50] to be present during normal and pathological activities in the brain.

    Under normal brain function, HFOs have been found in various brain structures, though

    particularly in the motor cortex during finger movements [51], and in the temporal lobe during

    working memory tasks [52]. HFOs have also been linked to numerous tasks including language,

    attention, visual and auditory tasks. It has been hypothesized that HFOs play a role in cortical

    processing, rather than acting as a marker for any given brain region [53]. HFOs are typically

    classified into two frequency bands: ripples (80 – 250 Hz) and fast ripples (> 250 Hz) [54].

    The most notable feature separating the ripples and fast ripples is their spatial origin. It has been

    shown that ripples generated from hippocampus or parahippocampal structures are often

    associated with normal physiological activity, while those associated with the dentate gyrus are

    often considered to be pathological in nature. Buzsáki et al. provide evidence that ripples play a

    role in memory consolidation. In rodent experiments where ripples were selectively suppressed

    after learning using timed electric stimulation, large impairments in daily performance were

    observed that were comparable to those caused by surgically damaging the hippocampus [55]. It has been

    postulated that ripples constitute a replay of important events, so that they can be encoded in a

    more permanent manner [56]. Furthermore, Grenier et al. found ripples to have a preferential

    presence during slow-wave sleep, during which time the brain is disconnected from the outside

    world, and their link to strong neuronal activity suggest that ripples may be involved in plasticity

    processes [57]. It has been suggested that ripples result from inhibitory field potentials which may

    be involved in strong coherence of long-range neuronal activity [44, 58]. Studies on epilepsy have

    found that ripples accompanied by continuous/semi-continuous background EEG activity are more

    prevalent in the hippocampus and occipital lobe, but show no correlation to the seizure onset or

    lesion sites [59], suggesting physiological rather than pathological neuronal activity. Fast ripples

    have been recorded in both rodent and human somatosensory cortex and are believed to reflect the

    coordinated discharge of neurons involved with the processing of incoming sensory information

    [60-62]. In general, fast ripples are believed to be pathological in nature and have been suggested

    to be the result of the strong coherence of abnormally bursting neurons [63]. Some have proposed

    that fast ripples are harmonics of the ripples [64]; however, the difference in the neuronal activity

    involved and the different spatial origins suggest otherwise [65].

    In recent years, HFOs, and in particular fast ripples, have gained prominence as biomarkers well

    suited for the identification of seizure pathology [49]. HFOs have been recorded as preceding

    seizure onset in epileptic patients [66]. Additionally, HFOs have been shown to provide diagnostic

    value. In particular, fast and abnormal HFOs in the 150-500 Hz frequency range have been

    recorded during interictal periods from the hippocampus and entorhinal cortex of patients with

    mesial temporal lobe epilepsy [62, 63]. HFOs, unlike the delta and theta LFOs, have been shown

    to be spatially confined to narrow regions and permit reliable localization of seizure onset zone

    (SOZ) [67, 68] by measuring increases in power [69, 70]. Finding the SOZ may allow for focused

    intervention, whether through resection surgery or some form of electrical neuromodulation [71-

    73].

    The clinical implications of finding the seizure focus have made HFOs a prominent topic of

    research over the past decade. Recent findings have shown that resection of tissue associated with

    HFOs is linked to positive surgical outcomes in both adults [74] and children [69, 75].

    Retrospective correlations of resection of regions with HFO activity with post-surgical outcomes

    have been performed in a number of studies [74-76]. Independent studies performed by Jacobs et

    al. and Wu et al. found that patients with a positive post-surgical outcome had a larger portion

    of HFO-associated tissue resected compared to patients with a poor post-surgical outcome. Fast

    ripple oscillations were detected retrospectively in 80% of patients and in cases where complete

    resection of tissue containing fast ripple activity was achieved, the resection was shown to correlate

    with seizure freedom [75].

    Furthermore, many studies have tried to use HFO rate thresholds as a way to distinguish

    epileptogenic regions, but this can be problematic, since the thresholds can be affected by sleep

    [62, 77] or medication [78], and often appear to depend on both the brain region and the

    patient. In addition to surgical correlations, HFOs have also been linked to therapeutic

    interventions using ACDs. After medication is withdrawn prior to assessment of surgical

    candidacy, increases in the number of HFOs have been observed [78], suggesting that iEEG-based

    HFO biomarkers can track changes due to ACDs. Differentiating physiological HFOs from

    pathological HFOs is not straightforward, as they share similar and overlapping frequencies [65].

    It is not completely clear which characteristics of the HFO activity are clinically relevant; hence,

    additional analysis techniques, such as the interactions of LFOs and HFOs, are warranted.

    1.2.3 LFO-HFO Cross-Frequency Coupling

    Multitudes of oscillations spanning LFO and HFO frequency ranges have been identified

    throughout the many regions of the brain. LFOs and HFOs, both on their own and through their

    interaction, have been shown to characterize brain state activity. This interaction of rhythms,

    referred to as cross-frequency coupling (CFC), is best understood in studies on the hippocampus

    where neural coding in the form of nested oscillations has been observed to facilitate short-term

    memory storage [79]. Coupled oscillations are ultimately suggested to be a timing mechanism for

    the serial processing of short-term memories. Lisman et al. [80] argue that theta and gamma

    activity are part of a common functional system and form a theta-gamma code that allows the

    hippocampus to process and recall long-term memories. It has been suggested that each memory

    is stored in a different 40 Hz sub-cycle of a particular LFO; that is, the amplitude of a higher

    frequency signal tends to show a preference for a particular phase of the lower frequency signal

    [81-84]. Studies on rodents show that CFC coincides with locations in a maze, with different cells

    firing as a rodent traverses a maze, hence the term “place cells”. The systematic progression of the

    phase as the spatial location changes is referred to as the phase precession. These studies on place

    cells suggest phase-amplitude cross-frequency interactions are important for neural coding [80].
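
    The idea that the amplitude of a faster rhythm prefers a particular phase of a slower rhythm can be illustrated with a small numerical sketch. The example below uses a generic normalised mean vector length as the coupling measure, which is a common choice in the literature and is not the index of cross-frequency coupling used later in this thesis. The FFT-mask band filter, the function names, and the synthetic 6 Hz and 40 Hz test signals are all assumptions made for illustration.

```python
import numpy as np


def band_analytic(signal, fs, low_hz, high_hz):
    """Band-limit `signal` with an FFT mask and return the analytic signal."""
    spectrum = np.fft.fft(signal)
    freqs = np.fft.fftfreq(len(signal), d=1.0 / fs)
    # Keep only positive frequencies inside the band; doubling them yields
    # the analytic (complex) version of the band-limited signal.
    mask = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.ifft(2.0 * spectrum * mask)


def phase_amplitude_coupling(signal, fs, phase_band, amp_band):
    """Normalised mean vector length, bounded between 0 and 1."""
    phase = np.angle(band_analytic(signal, fs, *phase_band))
    amplitude = np.abs(band_analytic(signal, fs, *amp_band))
    return np.abs(np.mean(amplitude * np.exp(1j * phase))) / np.mean(amplitude)


def synthetic_signals(fs=1000.0, duration_s=10.0):
    """Hypothetical test signals: 40 Hz activity with and without theta modulation."""
    t = np.arange(int(fs * duration_s)) / fs
    theta = np.sin(2 * np.pi * 6.0 * t)
    gamma = 0.5 * np.sin(2 * np.pi * 40.0 * t)
    coupled = theta + (1.0 + np.cos(2 * np.pi * 6.0 * t)) * gamma
    uncoupled = theta + gamma
    return coupled, uncoupled


if __name__ == "__main__":
    fs = 1000.0
    coupled, uncoupled = synthetic_signals(fs)
    print("coupled:  ", phase_amplitude_coupling(coupled, fs, (4, 8), (30, 50)))
    print("uncoupled:", phase_amplitude_coupling(uncoupled, fs, (4, 8), (30, 50)))
```

    In this sketch the coupled signal gives a clearly non-zero value, while the signal with constant 40 Hz amplitude gives a value near zero, since its amplitude shows no preference for any theta phase.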

    The ability of neural rhythms to carry out physiological functions, as seen with the theta-gamma

    neural code [85, 86], also suggests that disruptions in the neural codes can lead to pathology.

    Disruptions in neural code are best seen in studies on epilepsy, which show that interactions of

    rhythms or neuronal code can be correlated with surgical outcome [87]. Melanie et al. showed

    that surgical resection of regions showing fast ripples and ripples coexisting with flat background

    EEG activity would lead to seizure freedom, whereas resection of areas generating ripples with a

    continuously oscillating background EEG pattern did not result in favourable surgical outcomes

    [87]. Phase-amplitude CFC of ripples with delta rhythms has been documented during ictal

    states [88]. Guirgis et al. highlight that HFO presence is mainly linked to pathology when other

    rhythms are present, specifically the delta (< 4 Hz) rhythm. Their findings reveal that delta-HFO

    coupling emerges at seizure onset and termination, implying that the delta-HFO coupling may be

    involved in transition mechanisms leading to epileptogenesis [89]. Overall, these disruptions in

    cross-frequency coupling can provide insight into pathology and can be used as biomarkers to

    better characterize and treat neurological disorders.

    The greatest obstacles preventing more widespread clinical use of HFO-based diagnostic

    techniques are the difficulties associated with recording HFOs [90] and the sheer scale of the

    recordings. Intracranial recordings are invasive and are only used as a last resort when all other

    forms of intervention fail. Further, by the Nyquist criterion, HFOs require sampling in the

    kilohertz range. Because HFOs are spatially focal, recording them requires many spatially distributed

    electrodes, which produce massive high-dimensional data sets that are difficult to process, calling

    for the use of automated techniques of signal processing and machine learning.
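
    The sampling requirement can be illustrated with a short sketch of aliasing. The Nyquist criterion requires a sampling rate above twice the highest frequency of interest, so a fast ripple component at 300 Hz needs more than 600 Hz. The test frequency, the two sampling rates and the function name below are illustrative assumptions and are not values taken from the recordings described in this thesis.

```python
import numpy as np


def dominant_frequency(signal_hz, fs, duration_s=1.0):
    """Sample a sinusoid at rate `fs` and return the peak frequency of its spectrum."""
    n = int(round(fs * duration_s))
    t = np.arange(n) / fs
    x = np.sin(2 * np.pi * signal_hz * t)
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs[np.argmax(spectrum)]


if __name__ == "__main__":
    # Sampling well above 2 x 300 Hz recovers the true frequency.
    print(dominant_frequency(300.0, fs=2000.0))
    # Sampling below 2 x 300 Hz folds the component to a lower, false frequency.
    print(dominant_frequency(300.0, fs=400.0))
```

    With a 400 Hz sampling rate the 300 Hz component appears at 100 Hz, inside the range where slower activity would be expected, which is why under-sampled recordings cannot be used to study HFOs.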

    1.3 Prediction of Treatment Outcome

    Univariate analysis of electroencephalogram (EEG) activity has been used to differentiate between

    healthy individuals and patients suffering from a wide range of neurological disorders. These

    findings are usually significant at a group level and have had limited clinical translation. This

    should be of no surprise as the human brain is a highly complex system, with no two brains wired

    exactly the same. The goal of early diagnosis, planning of treatment and tracking of disease

    progression is not easily achieved.

    Conventional research is typically applied at a group level focusing on commonalities across

    groups of people and often ignoring differences as outliers. The group approach often ignores the

    nonlinearity of the individual response. In contrast, doctors working with patients have to make

    clinical decisions about individuals by taking into consideration medical background, blood work,

    and other available data in order to make an objective inference at the level of an individual rather

    than the group.

    There has been a growing interest in the use of analytical methods to allow for inference at the

    individual level. One way of achieving it is through the use of supervised machine learning

    predictive models such as support vector machines (SVMs). The goal of supervised learning is to

    develop a discriminative function that can make classifications or predictions from a set of

    features. Advantages of this method are: (1) it allows for multivariate characterization at the level

    of the individual, potentially leading to results with higher clinical translation; (2) supervised

    machine learning techniques can pick up on subtle, otherwise undetectable nonlinear differences

    in the brain.

    The three main clinical focus areas employing inference include diagnosing, predicting disease

    onset and, more recently, predicting treatment outcome. There have been many studies on

    diagnosing disorders [91, 92] and predicting disease onset in advance [93], and only a few studies

    have focused on predicting treatment response in a machine learning context [94, 95]. Of the

    studies performed on predicting treatment outcomes, the vast majority focused on individuals

    suffering from major depression, with some work done on schizophrenia [94, 96]. Applying

    machine learning techniques in the study of major depression is clinically impactful, as individuals

    respond differently to treatments. For example, in major depression a third of the patients show no

    improvements to antidepressant treatment [97].

    Studies using structural magnetic resonance imaging (MRI) data in conjunction with SVMs to

    predict treatment outcome in major depression were able to link treatment outcome to grey matter

    presence with an accuracy of 88.9% (88.9% specificity and 88.9% sensitivity) [95]. In a follow-up

    study with a larger sample size, white matter and grey matter features

    correctly predicted treatment outcome 12 weeks in advance with accuracies of 69.57%

    and 65.22%, respectively [98]. A similar study was conducted on chronic schizophrenia, this time

    using pretreatment EEG data to predict the response to clozapine [94]. In that study they were able

    to achieve an accuracy of 85% on 23 subjects, providing support for predicting treatment outcome

    from brain activity. In a study on schizophrenia, it was shown that scalp EEG features examining

    combinations of power, and coherence measures are effective at predicting favorable treatment

    response to antidepressant drug therapy [96].
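
    For readers less familiar with the metrics quoted above, the sketch below shows how accuracy, sensitivity and specificity follow from the counts of correct and incorrect predictions. The counts used in the example are hypothetical and were chosen only because they reproduce a value of 88.9%; they are not the counts reported in [95].

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: responders predicted correctly/incorrectly.
    tn/fp: non-responders predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical counts: 8 of 9 responders and 8 of 9 non-responders correct.
    acc, sens, spec = classification_metrics(tp=8, fn=1, tn=8, fp=1)
    print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```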

    The topic of treatment outcome prediction is of great importance for epilepsy applications, as

    individuals respond differently to treatments. For example, roughly 30-40% of patients do not

    respond to anticonvulsive drugs, and of those that do respond, not all respond in the same way.

    Finding the most effective treatment can be an ordeal for patients, often resulting in unnecessary

    complications. Anticonvulsive drugs can make the seizures worse and more frequent, and are

    associated with numerous side-effects that can affect cognition and patients’ abilities to perform

    [99-101]. Furthermore, patients can build up a tolerance to certain drugs over time [102], at which

    point the trial-and-error search for a treatment resumes. It would go a long way towards improving

    quality of life for patients if treatments could be tailored to them individually. While machine

    learning techniques have been applied in epilepsy studies to classify seizure events [103], predict

    the seizure onset [104], and most recently, identify seizure onset zone [105], machine learning

    methods that predict treatment outcome have not, to the best of our knowledge, been applied in

    epilepsy research until now.

    Unsuccessful drug trials and delayed treatments have a high impact on quality of life and are

    expensive for both patients and the health care system. Determining a priori the most effective

    treatment would help improve the lives of patients and reduce the financial burden associated with

    anticonvulsive drug treatments. The first step for developing a successful treatment plan is

    accurately diagnosing or characterizing the disorder. Our brains are not well suited for multivariate

    correlations, but fortunately machine learning techniques have been designed specifically with this

    purpose in mind.

    1.4 Objectives and Hypothesis

    1.4.1 Objectives

    This thesis studies the use of iEEG-based features to track LFO-HFO interactions through cross-

    frequency coupling (CFC) in Mecp2-deficient mice undergoing ACD therapy, to determine

    treatment efficacy. Two commonly used machine learning techniques, Support Vector Machines

    (SVMs) and Random Forests (RF), are trained and evaluated on time-based, normalized power

    and cross-frequency coupling features to predict the likelihood of treatment outcome for

    commonly used ACDs.

    The main contributions of this thesis are as follows:

    i. Detection and identification of epileptiform discharges in a genetic model of epilepsy using

    intracranial EEG recordings. Automated detection of discharge events is obtained from

    several hours of time-series recordings in the Mecp2-deficient mouse model of Rett

    Syndrome.

    ii. Characterization of discharge durations over developmental time points. Not all

    discharges are the same. Analyses of discharge durations show that long duration

    discharges are better suited for tracking pathology.

    iii. Identification and tracking of epileptiform activity over developmental time points.

    Accurately diagnosing or characterizing epileptiform discharges is vital to developing

    a successful treatment plan. LFO and HFO features of cross-frequency coupling obtained from

    iEEG are used to track biomarkers of epileptiform pathology over development.

    iv. Examine heterogeneity in drug treatment outcomes. Variability of ACD treatment outcome

    across subjects is observed, suggesting that a patient-specific approach is necessary for

    properly selecting the most appropriate ACD treatment.

    v. Evaluate machine learning techniques for ranking the efficacy of drug treatment options.

    SVM and RF machine learning methodologies are used to estimate likelihood scores for

    successful treatment outcome based on time, normalized power, and cross-frequency

    coupling features. In particular, features of cross-frequency coupling yield the most

    effective a priori identification of appropriate ACD treatment.

    This work has been published [26, 29, 71, 105-111].

    1.4.2 Hypothesis

    The hypotheses of this thesis are as follows:

    H1: iEEG-based measures of delta – HFO phase-amplitude CFC in long duration discharges

    can differentiate between epileptiform and non-epileptiform discharges, and can provide

    patient-specific biomarkers of epileptiform activity.

    H2: A priori features of LFO-HFO CFC in conjunction with machine learning techniques

    can accurately predict drug treatment outcome in a Mecp2-deficient mouse model of

    Rett syndrome.

    1.5 Outline of Chapters

    The thesis is organized as follows:

    Chapter 1 introduces the background and motivation for the study of electrical rhythms, and their

    role in the diagnosis and prediction of epileptiform activity.

    Chapter 2 provides a review of the methodology pertaining to the experimental protocols and

    time-frequency analyses and machine learning techniques applied in this thesis.

    Chapter 3 identifies and tracks the spatiotemporal progression of epileptiform activity in the low

    frequency range.

    Chapter 4 characterizes and tracks the progression and interaction of low and high frequency

    range epileptiform discharges.

    Chapter 5 presents a patient-specific protocol using machine learning techniques to predict

    treatment outcome to various anticonvulsive drugs.

    Chapter 6 discusses the significance and future directions of the results presented in this thesis.

    Chapter 2

    Methodology

    The work presented here is subdivided into three sections: experimental setup, signal processing,

    and machine learning, where machine learning covers feature extraction, labeling, training and

    evaluation. The iEEG recordings were obtained from Mecp2+/- mice exhibiting spontaneous and

    recurrent epileptiform discharges. These recordings came from pre- and post- mecp2 gene

    reactivation, and pre- and post- drug treatments (Figure 1a).

    Signal processing techniques of normalized time-frequency power, rhythm extraction, and

    measures of cross-frequency coupling were applied to obtain features of pathology (Figure 1b).

    Signal processing techniques consist of the continuous wavelet transform (CWT), ensemble

    empirical mode decomposition (EEMD), and index of cross-frequency coupling (ICFC). In addition

    to visualizing the time-frequency power distribution, CWT was also used to obtain the phase and

    amplitude signals used to compute the comodulogram which provides a measure of cross-

    frequency coupling. A focused measure of cross-frequency coupling was obtained by using

    EEMD. EEMD extracted Low Frequency Oscillations (LFOs) and High Frequency Oscillations

    (HFOs) avoiding the need to select specific frequency bands a priori. In addition to visualizing the

    data and determining the treatment efficacy, signal processing techniques were also used to obtain

    features for the machine learning prediction algorithm.
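
    To make the role of the wavelet transform concrete, the following is a minimal sketch of how a complex wavelet yields an amplitude and a phase signal at a chosen frequency. It uses a complex Morlet wavelet with six cycles and a simple amplitude normalisation; the wavelet family, its parameters, the function name and the synthetic test signal are assumptions for illustration and may differ from the CWT configuration used in this thesis.

```python
import numpy as np


def morlet_coefficients(signal, fs, freq_hz, n_cycles=6.0):
    """Convolve `signal` with a complex Morlet wavelet centred on `freq_hz`."""
    sigma_t = n_cycles / (2.0 * np.pi * freq_hz)   # Gaussian width in seconds
    half = int(np.ceil(4.0 * sigma_t * fs))        # support of +/- 4 sigma
    t = np.arange(-half, half + 1) / fs
    wavelet = np.exp(2j * np.pi * freq_hz * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    # Scale so that a unit-amplitude sinusoid at freq_hz gives |coefficient| ~ 1.
    wavelet *= 2.0 / np.sum(np.abs(wavelet))
    return np.convolve(signal, wavelet, mode="same")


if __name__ == "__main__":
    fs = 1000.0
    t = np.arange(int(10 * fs)) / fs
    # Hypothetical signal: a 3 Hz (delta) rhythm plus a weaker 100 Hz component.
    x = np.sin(2 * np.pi * 3.0 * t) + 0.2 * np.sin(2 * np.pi * 100.0 * t)
    low = morlet_coefficients(x, fs, 3.0)
    high = morlet_coefficients(x, fs, 100.0)
    delta_phase = np.angle(low)       # phase signal of the slow rhythm
    hfo_amplitude = np.abs(high)      # amplitude signal of the fast rhythm
    centre = slice(3000, 7000)        # avoid edge effects of the convolution
    print(np.median(np.abs(low)[centre]), np.median(hfo_amplitude[centre]))
```

    Repeating this over a grid of phase and amplitude frequencies, and measuring the coupling between each phase and amplitude pair, is the general idea behind a comodulogram.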

    Figure 1. Overview of experimental design, signal processing and machine learning methods.

    (A) Experimental setup consisted of pre- and post- gene reactivation along with drug testing. (B) Signal processing

    overview highlights the rhythm extraction using complex CWT and extraction of specific rhythms of interest using

    EEMD for purposes of generating labels. (C) Machine learning overview highlighting the selection of machine

    learning algorithms using t-SNE projections. Figure taken from [111].

    Machine learning algorithms were evaluated on their ability to predict treatment outcome from the

    iEEG-based features. Figure 1c provides an outline of the machine learning methodology used in

    this study. Samples for training and evaluation consisted of epileptiform discharges obtained from

    six mice pre- and post-treatment. The subjects were examined for discharges using predefined

    selection criteria [26, 107]. Post-treatment recordings of long duration discharges were evaluated

    for presence of significant delta – fast ripple cross-frequency coupling. A percentage score was

    used to evaluate the efficacy of each ACD treatment. Treatment outcomes resulting in a low

    percentage of delta-HFO cross-frequency coupling were labeled as positive responders, whereas

    those with high percentages were labeled as negative responders or non-responders. Two

    commonly used machine learning algorithms, SVMs and RFs, were trained on EEMD time-based,

    normalized power and comodulogram features to predict the likelihood of successful treatment.

    The accuracy of these predictions was compared using Receiver Operating Characteristic (ROC)

    plots, and by evaluating the percentage of samples that predicted successful treatment outcomes.
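
    The labeling and evaluation steps described above can be sketched as follows. The 50% cut-off between positive responders and non-responders is a placeholder assumption, since no threshold is stated in this section, and the percentages and classifier scores in the example are hypothetical. The ROC construction below is a simplified version that assumes all scores are distinct.

```python
import numpy as np


def label_responders(cfc_percentages, threshold_percent=50.0):
    """Label 1 (positive responder) when the CFC percentage is below the cut-off.

    The default cut-off is an assumed placeholder, not a value from the thesis.
    """
    return (np.asarray(cfc_percentages, dtype=float) < threshold_percent).astype(int)


def roc_curve(labels, scores):
    """False and true positive rates from sweeping a threshold over the scores."""
    labels = np.asarray(labels)
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    sorted_labels = labels[order]
    true_pos = np.cumsum(sorted_labels == 1)
    false_pos = np.cumsum(sorted_labels == 0)
    tpr = np.concatenate(([0.0], true_pos / true_pos[-1]))
    fpr = np.concatenate(([0.0], false_pos / false_pos[-1]))
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


if __name__ == "__main__":
    # Hypothetical post-treatment percentages of discharges with delta-HFO CFC.
    percentages = [10, 80, 25, 60, 5, 90]
    labels = label_responders(percentages)
    # Hypothetical likelihood scores produced by a classifier for the same samples.
    scores = [0.9, 0.2, 0.7, 0.65, 0.6, 0.3]
    fpr, tpr = roc_curve(labels, scores)
    print(labels, area_under_curve(fpr, tpr))
```

    An area of 1 would mean every positive responder received a higher score than every non-responder, while 0.5 corresponds to chance-level ranking.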

    2.1 Experimental Setup

    All animal experimental procedures were approved by local ethics committees in accordance with

    guidelines of the Canadian Council on Animal Care.

    2.1.1 Animal Subjects

    Experimental genotypes were produced by crossing female Mecp2+/- mice (Mecp2tm1.1Bird,

    Jackson Laboratory, Bar Harbor, ME, USA) with male wild-type mice, as described previously [112,

    113]. In total, there were six (n=6) female Mecp2+/- subjects. In the case of the mecp2 gene-reactivated

    model, the mice were generated by crossing female Mecp2+/- mice with male Rosa26-Esr/Cre

    transgenic mice (Gt(ROSA)26Sortm1(cre/ESR1)Tyj/J, Jackson Laboratory) [29, 113]. In total,

    there were four (n=4) female mecp2-reactivated mice (rescue mice). Mice were subjected to experimental

    procedures only once they displayed clear Rett-like behavioural traits and were studied between 3

    and 23 months of age. All subjects were maintained on a pure C57BL/6J background and housed

    in a vivarium that was maintained at 22-23 °C with a standard 12-hour light cycle commencing at

    06:00.

    2.1.2 Implantation Surgery

    Animals were implanted with polyimide-insulated stainless steel electrodes (125

    μm) following procedures described previously [21, 114]. Preconfigured microelectrodes were

    implanted in the somatosensory cortex (Bregma, -0.8 mm; lateral, 1.8 mm; depth, 1.5 mm) with a

    reference implanted superficially in the frontal brain area (Bregma, +2.8 mm; lateral, 1.8 mm; depth,

    0.5 mm) (Figure 2b). Animals were allowed at least 7 days of post-surgical recovery time before

    further experimentation.

    2.1.3 iEEG Recording and Analysis

    Recordings of iEEG were obtained from Mecp2+/- mice exhibiting spontaneous and recurrent

    epileptiform discharges centered in the roughly 6 – 10 Hz frequency range, as described previously

    [21, 115]. These recordings came from pre- and post- mecp2 gene reactivation, and pre- and post-

    established and experimental anticonvulsive drug (ACD) treatment regimens (Figure 2). iEEG

    signals were amplified 1000x, bandpass-filtered (0.01 – 1000 Hz) and digitized (Digidata 1300,

    Axon Instruments, Weatherford, TX, USA). Data were sampled at 60 kHz and stored using

    Clampfit 10.2 software (Axon Instruments). Recording sessions lasted from 30 min to 1 hour to

    observe all of the behavior states. Discharge events were identified in the recordings using an

    automated detection technique (see 2.2.1) based on predefined selection criteria as described

    previously [26, 107]. Data preprocessing consisted of downsampling to 4 kHz followed by

    removal of 60 Hz line noise using a high-order FIR notch filter with ±0.5 Hz cutoffs. Segments

    with large amplitude muscle artifacts were excluded from analysis.

    2.1.4 Established and Experimental Anticonvulsive Drug Treatments

    Discharges observed under Mecp2 deficiency have been linked to cortical network

    hyperexcitability induced by alterations in GABAergic neurons which in turn affects GABA-

    related signaling [32]. Pharmacological interventions were obtained from Food and Drug

    Administration (FDA) and experimental ACD sources and were based primarily on their

    effectiveness on GABAergic circuits. The drugs and dosages used in this study were: Midazolam

    at 0.5 mg/kg, Ganaxolone at 5 mg/kg, 4,5,6,7-tetrahydroisoxazolo[5,4-c]pyridine-3-ol (THIP) at 5

    mg/kg, and Phenytoin at 30 mg/kg. Phenytoin is an FDA-approved ACD used mainly for long-term

    control of seizures. Phenytoin is believed to protect against seizures by causing a

    voltage-dependent block of sodium channels, which prevents sustained high-frequency firing of action

    potentials [116]. Phenytoin has also been linked to synaptosomal transport of glutamate and

    GABA, which may explain its effectiveness in Mecp2-deficient mice [117]. Midazolam is a

    fast-acting benzodiazepine typically reserved for emergency control of acute seizures, including status

    epilepticus [118]. Midazolam acts on GABAA receptors; however, rather than activating them

    directly, it acts as an intermediary, like other benzodiazepines, to enhance the effect

    of the GABA neurotransmitter [119]. Ganaxolone was selected for anticonvulsive effects arising

    from its GABAA modulatory effects. Ganaxolone has been shown to bind to the GABAA receptor

    to modulate and open chloride ion channels causing an inhibitory effect on neurotransmission and

    reducing the chance of action potentials [120]. THIP, which acts as a selective agonist for the

    delta subunit of the GABAA receptor, was selected [121] and has previously been shown to have

    anticonvulsive properties [32]. All drugs were dissolved in double-distilled H2O and administered intraperitoneally

    to the animals. All pharmacological treatments were applied for a period of one day followed by a

    day of washout before administering any other pharmacological compound.

    Figure 2. Mecp2-deficient mouse model of Rett Syndrome.

    (A) Close-up of the implanted polyimide-insulated stainless steel electrodes highlighting the arrangement.

    The reference electrode was implanted superficially in the frontal cortex. The analyses presented here focused on the

    recordings from the somatosensory cortex. (B) Representative time-series recording of an epileptiform discharge

    observed in the Mecp2-deficient mouse model.

    2.2 Signal Processing

    2.2.1 Automated Discharge Detection

    Time-series iEEG traces were visually inspected by a condition-blinded investigator to confirm and

    quantify the presence of discharge activity as previously described [26, 122, 123]. A discharge

    event was defined as having a duration of at least 0.4 seconds, an amplitude of at least 1.5-fold

    background, and a frequency in the theta 6-10 Hz frequency range [26].

    Using the established manual criteria [26, 122, 123], we developed an automated method for

    detecting epileptiform discharge events and recording the time of occurrence and duration. The

    method is similar to that outlined in [26, 124], the main exception being the additional amplitude

    criterion for defining a discharge event. The first step of the automation is the application of a

    6–10 Hz FIR band-pass filter to isolate the frequency band associated with the discharges. This

    frequency band was chosen as it limited the effect of high delta power, while at the same time

    capturing the increased theta power present during a discharge (Figure 3). The envelope of the

    filtered signal was created by convolving a Gaussian kernel of 200-point aperture with the square

    of the filtered data. The envelope peaks at the presence of strong 6–10 Hz power. Normal cortical

    LFP signals within this frequency range rarely display high-amplitude rhythmic spiking (Figure

    3a). As a result, the envelope peak reflects the presence of a discharge event (Figure 3b). We then

    determined envelope thresholds specific to each subject with the emphasis of minimizing false

    positives. Discharge durations were determined by finding the left and right inflection points of

    detected events, which indicated the start and end points respectively of a discharge. The inflection

    points were computed by convolving the envelope with the derivative of the Gaussian kernel

    resulting in the rate of change plot shown in Figure 3c. Discharges shorter than 0.4 seconds were not

    considered. The average areas under the curve 0.5 seconds to the left and right were compared to

    the average area under the discharge. If the average area under the curve of the discharge was at

    least twice that of the sides, then the discharge was accepted.
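
The envelope and acceptance steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes the input has already been band-pass filtered to 6–10 Hz, it takes event boundaries at threshold crossings rather than at inflection points, and the function names, the kernel width `sigma` and the `threshold` argument are hypothetical.

```python
import numpy as np

def gaussian_kernel(n_points=200, sigma=None):
    """Normalized Gaussian window with an n_points aperture (sigma is an assumed choice)."""
    sigma = n_points / 6.0 if sigma is None else sigma
    t = np.arange(n_points) - (n_points - 1) / 2.0
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def envelope(filtered, n_points=200):
    """Smooth the squared band-limited signal to track 6-10 Hz power."""
    filtered = np.asarray(filtered, dtype=float)
    return np.convolve(filtered ** 2, gaussian_kernel(n_points), mode="same")

def detect_discharges(filtered, fs, threshold, min_dur=0.4, side=0.5, ratio=2.0):
    """Return (start, end) sample indices of accepted events.

    Boundaries are threshold crossings of the envelope, a simplification
    of the inflection-point rule described in the text.
    """
    env = envelope(filtered)
    above = np.concatenate(([False], env > threshold, [False]))
    edges = np.flatnonzero(np.diff(above.astype(int)))  # rising, falling, rising, ...
    n_side = int(side * fs)
    events = []
    for start, end in zip(edges[::2], edges[1::2]):
        if (end - start) / fs < min_dur:
            continue  # shorter than the minimum duration
        flank = np.concatenate((env[max(0, start - n_side):start],
                                env[end:end + n_side]))
        # Accept if the mean envelope inside is at least `ratio` times the flanks.
        if flank.size == 0 or env[start:end].mean() >= ratio * flank.mean():
            events.append((int(start), int(end)))
    return events
```

Because the aperture is expressed in samples, the amount of smoothing in this sketch depends on the sampling rate of the input.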

    Figure 3. Automated epileptiform discharge detection applied on a 10 second iEEG segment.

    (A) Sample in vivo iEEG recording with two discharge events shown. Red vertical lines indicate the boundary

    separating discharge and inter-discharge regions. (B) Filtered 6-10Hz signal with the envelope (green) tracking the

    power of the theta band which was used to distinguish between discharge and inter-discharge regions. (C) The

    envelope rate of change demonstrating how the inflection points were used to identify the beginning and end of a

    discharge.

    2.2.2 Delta Power

    Regions of low and high delta frequency power (0.5-4 Hz) as defined by [125] were determined.

    The first step involved removing the 0.25-second periods preceding and following artifact

    events, where artifacts were characterized by voltages greater than 0.5 V. Then, an FIR band

    pass filter with an order of 1000 was applied to isolate the 0.5-4 Hz delta frequency band. The

    delta band signal was squared and averaged over 30 second intervals to obtain the delta power. To

    discern the daily patterning of the delta signals, we normalized the delta power signals to a mean

    of zero and variance of one. A Gaussian-based kernel similar to that defined for the automated

    discharge detection, but with a 50-point aperture, was applied to the filtered signal, generating an

    envelope that represents the delta power. In order to discretize the signal, we used a zero threshold

    to discern the difference between the high and low delta frequency power states.
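
A minimal sketch of the power, normalization and thresholding steps is given below. It assumes the input is already the 0.5-4 Hz delta-band signal with artifacts removed, and it omits the Gaussian smoothing step; the function name `delta_states` is hypothetical.

```python
import numpy as np

def delta_states(delta_band, fs, window_s=30.0):
    """Return z-scored delta power per window and a high/low state per window."""
    x = np.asarray(delta_band, dtype=float)
    n = int(window_s * fs)                 # samples per 30 s window
    n_win = x.size // n                    # drop any incomplete final window
    segments = x[:n_win * n].reshape(n_win, n)
    power = (segments ** 2).mean(axis=1)   # mean squared amplitude per window
    z = (power - power.mean()) / power.std()  # zero mean, unit variance
    return z, z > 0.0                      # zero threshold: True = high delta power
```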

    2.2.3 Time-Frequency Analysis

    Time-frequency power distributions were obtained by applying a continuous wavelet transform

    (CWT) to iEEG time-series recordings over a time interval $\hat{t}$ centered on an

    epileptiform discharge (Figure 4). The CWT measures the correlation between a signal $x(\hat{t})$ and a wavelet

    basis, $\psi$, for different scales, $s$, and time shifts, $\tau$ [126], and is defined as,

    $$W(s, \tau) = \int_{\hat{t}} x(t)\, \psi^{*}_{s,\tau}(t)\, dt \tag{1.1}$$

    where,

    $$\psi_{s,\tau}(t) = \frac{1}{\sqrt{s}}\, \psi_{o}\!\left(\frac{t - \tau}{s}\right) \tag{1.2}$$

    is the basis function with * denoting the complex conjugate. The basis function used here is the

    complex Morlet wavelet defined according to the following form,

    $$\psi_{o}(t) = \frac{1}{\sqrt{2\pi}} \exp\!\left(i \omega_{c} t - \frac{t^{2}}{2}\right) \tag{1.3}$$

    The scales were transformed to appropriate frequencies $f$, from the angular frequency $\omega_{c}$, using

    the relation $\omega_{c} = 2\pi f s = 5.1$ rad/s. The CWT yielded a complex-valued coefficient

    matrix,

    $$W(f, t) = w(f, t) + i\,\tilde{w}(f, t) \tag{2.1}$$

    for which the magnitude was obtained as a measure of power over time and frequency. The

    frequencies ranged from 1 to 600 Hz in 1 Hz steps, and the times correspond to the duration

    of the recording $x(\hat{t})$.
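
As a rough illustration of the transform defined above, the sketch below evaluates the complex Morlet CWT by direct convolution at a set of frequencies. It is a sketch under stated assumptions rather than the implementation used here: the function names, the truncation of the wavelet at `n_sd` scale lengths and the zero-padded edge handling are all assumptions.

```python
import numpy as np

OMEGA_C = 5.1  # centre angular frequency of the wavelet, as given in the text

def morlet(t):
    """Complex Morlet mother wavelet psi_o(t)."""
    return np.exp(1j * OMEGA_C * t - t ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def cwt_morlet(x, fs, freqs, n_sd=4.0):
    """Return a (len(freqs), len(x)) complex coefficient matrix W(f, t)."""
    x = np.asarray(x, dtype=float)
    out = np.empty((len(freqs), x.size), dtype=complex)
    for k, f in enumerate(freqs):
        s = OMEGA_C / (2.0 * np.pi * f)        # scale from omega_c = 2*pi*f*s
        half = int(np.ceil(n_sd * s * fs))     # truncate the Gaussian tails
        tau = np.arange(-half, half + 1) / fs
        psi = morlet(tau / s) / np.sqrt(s)
        if psi.size > x.size:
            raise ValueError("signal too short for the requested frequency")
        # conj(psi(-t)) equals psi(t) for this wavelet, so correlating x with
        # the conjugated wavelet is the same as convolving x with psi.
        out[k] = np.convolve(x, psi, mode="same") / fs
    return out
```

The magnitude `np.abs(W)` then gives the power measure over time and frequency described above.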

    To visualize the high and low frequencies on the same scale, the time-frequency vectors were

    further z-score normalized according to the following,

    $$W_{norm}(f, t) = \frac{|W(f, t)| - \mu(f)\big|_{t_1}^{t_2}}{\sigma(f)\big|_{t_1}^{t_2}} \tag{2.2}$$

    where the variables $\mu$ and $\sigma$ represent the mean and standard deviation of the wavelet coefficient

    magnitudes for each corresponding frequency, taken from a two-second baseline segment $\hat{t}$

    obtained prior to seizure onset (i.e. $\hat{t} \in [t_1, t_2]$); see Figure 4b,c.
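
The baseline normalization can be sketched in a few lines. The function name and the array layout (frequencies along rows, time along columns) are assumptions made for illustration.

```python
import numpy as np

def zscore_by_baseline(W, times, t1, t2):
    """Z-score |W(f, t)| per frequency using baseline samples with t1 <= t <= t2."""
    mag = np.abs(W)
    times = np.asarray(times, dtype=float)
    base = mag[:, (times >= t1) & (times <= t2)]
    mu = base.mean(axis=1, keepdims=True)      # per-frequency baseline mean
    sigma = base.std(axis=1, keepdims=True)    # per-frequency baseline std
    return (mag - mu) / sigma
```

Because each frequency row is scaled by its own baseline statistics, rows that differ only by a constant gain map to the same normalized values.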

    2.2.4 Comodulogram

    Cross-frequency coupling for a time interval $\hat{t}$ is proposed as a composite complex-valued signal

    $S_{CFC}(\hat{t})$ consisting of the amplitude time-series of a higher frequency, $A(\hat{t}, f_H)$, and the

    phase time-series of a lower frequency, $\varphi(\hat{t}, f_L)$, as shown (see Figure 5),

    Figure 4. Normalized time-frequency analysis of discharges observed during epileptiform events.

    (A) iEEG recording of discharge events shows HFO activity during the peak of the discharge. (B) Standard

    time-frequency analysis of discharges using CWTs does not reveal any HFOs due to 1/f power scaling. (C) Z-score

    normalization of CWT coefficients using a baseline region prior to seizure onset corrects for the power scaling and

    highlights the presence of HFOs.

    $$S_{CFC}(\hat{t}, f_H, f_L) = A(\hat{t}, f_H)\, e^{j \varphi(\hat{t}, f_L)} \tag{3.1}$$

    The time-series of the amplitude envelope $A(\hat{t}, f_H)$ and instantaneous phase $\varphi(\hat{t}, f_L)$ were

    determined from the respective complex wavelet coefficients,

    Figure 5. Step-by-step computation of the ICFC measure for a particular HFO-LFO frequency pair.

    (A) Obtain iEEG time segment centered on an epileptiform discharge. (B) CWT-based time-frequency computation

    generates coefficients across frequencies 1 to 600 Hz with a 1 Hz step size. (C) Extraction of an HFO and LFO signal

    from the CWT coefficients. (D) Computation of the envelope from the HFO signal and phase for the LFO. (E) The

    envelope amplitude values are binned in 20 degree phase intervals to obtain a histogram. Amplitude-phase histograms

    that deviate from the uniform distribution lead to increased ICFC measurements.

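    $$A(\hat{t}, f_H) = \left| w(\hat{t}, f_H) + j\,\tilde{w}(\hat{t}, f_H) \right| \tag{3.2}$$

    $$\varphi(\hat{t}, f_L) = \arctan\!\left( \frac{\tilde{w}(\hat{t}, f_L)}{w(\hat{t}, f_L)} \right) \tag{3.3}$$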

    The coupling of the amplitude of a higher frequency signal, $A(\hat{t}, f_H)$, to the phase of a lower

    frequency signal, $\varphi(\hat{t}, f_L)$, was assessed over a range of frequency pairs using the algorithm

    proposed by Tort et al. [127]. The $\varphi(\hat{t}, f_L)$ signal was segmented into 20 degree intervals resulting

    in $N = 18$ bins. Within each bin the amplitude envelopes were averaged, $\langle A(\hat{t}, f_H) \rangle$. The mean

    amplitude was normalized by the sum over all mean amplitudes in each phase bin, according to,

    $$p_j(\hat{t}, f_H, f_L) = \frac{\langle A(\hat{t}, f_H) \rangle_j}{\sum_{k=1}^{N} \langle A(\hat{t}, f_H) \rangle_k} \tag{4.1}$$

    producing a probability density value $p_j$, where $j$ indicates the phase bin number which is

    associated with $f_L$. Then an entropy measure defined by:

    $$H(\hat{t}, f_H, f_L) = -\sum_{j=1}^{N} p_j(\hat{t}, f_H, f_L) \log\!\left(p_j(\hat{t}, f_H, f_L)\right) \tag{4.2}$$

    was determined and normalized to obtain the index of cross-frequency coupling,

    $$I_{CFC}(\hat{t}, f_H, f_L) = \frac{H_{max} - H(\hat{t}, f_H, f_L)}{H_{max}} \tag{4.3}$$

    where $H_{max}$ is the maximum possible entropy value, which for a uniform distribution has a value

    $H_{max} = \log N$. The $I_{CFC}$ measure described by equation 4.3 yields the comodulogram when

    examined over a range of frequencies. The computation of the ICFC is summarized in Figure 5.
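
For a single frequency pair, the binning, normalization and entropy steps in equations 4.1 to 4.3 can be sketched as below. The function name `icfc` and the input convention (paired phase samples in degrees and amplitude samples) are assumptions for illustration; empty phase bins are treated as contributing zero amplitude.

```python
import math

def icfc(phase_deg, amplitude, n_bins=18):
    """Index of cross-frequency coupling from paired phase and amplitude samples."""
    width = 360.0 / n_bins
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for ph, a in zip(phase_deg, amplitude):
        j = int((ph % 360.0) // width) % n_bins   # phase bin index
        sums[j] += a
        counts[j] += 1
    means = [s / c if c else 0.0 for s, c in zip(sums, counts)]  # <A>_j
    total = sum(means)
    p = [m / total for m in means]                               # Eq. (4.1)
    h = -sum(pj * math.log(pj) for pj in p if pj > 0.0)          # Eq. (4.2)
    h_max = math.log(n_bins)
    return (h_max - h) / h_max                                   # Eq. (4.3)
```

A flat amplitude-phase histogram gives a value of zero, while amplitude confined to a single phase bin gives a value of one. Evaluating the function over a grid of frequency pairs would fill in the comodulogram.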

    2.2.5 Empirical Mode Decomposition

    The Ensemble Empirical Mode Decomposition (EEMD) was applied to extract rhythms which were

    used with the time-based features, in addition to the computation of ICFC surrogates for the selected

    delta, theta LFO and fast ripple HFO rhythms [128]. EEMD is applied to decompose signals into

    rhythms and it does not require a priori knowledge of the frequency ranges of the rhythms, as

    would have been needed for bandpass filtering. Furthermore, EEMD is better suited for extracting

    rhythms with time-varying frequencies that span large frequency bands.

    The EEMD separates a signal into multiple rhythms referred to as intrinsic mode functions (IMFs).

    The decomposition as seen in Figure 6 is adaptive and dependent on local time characteristics of

    the data. The general method works by decomposing a time-series signal into nearly orthogonal

    components (IMFs) using a process called sifting. This method can be summarized by two steps:

    i. The upper Xmax(t), and lower Xmin(t), envelopes of the signal are created by connecting

    a smooth spline to maxima and minima, respectively of a given signal X(t).

    ii. The mean of the two envelopes is then subtracted from the data to get a difference signal,

    $$X_1(t) = X(t) - \frac{X_{max}(t) + X_{min}(t)}{2} \tag{5}$$

    This process is repeated by setting the new X(t) = X1(t) until a stopping condition is met.

    Generally, the stopping condition is met after a certain number of iterations are reached.
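
The two sifting steps can be sketched as follows. This is a simplified illustration, not the EEMD implementation used in this work: linear interpolation stands in for the smooth spline envelopes, the stopping rule is a fixed number of passes, and the function names are hypothetical.

```python
import numpy as np

def local_extrema(x):
    """Indices of strict interior local maxima and minima."""
    d = np.diff(x)
    maxima = np.flatnonzero((d[:-1] > 0) & (d[1:] < 0)) + 1
    minima = np.flatnonzero((d[:-1] < 0) & (d[1:] > 0)) + 1
    return maxima, minima

def sift_once(x):
    """One sifting pass: subtract the mean of the upper and lower envelopes."""
    x = np.asarray(x, dtype=float)
    n = np.arange(x.size)
    maxima, minima = local_extrema(x)
    # Linear interpolation is used here in place of a smooth spline.
    upper = np.interp(n, maxima, x[maxima])
    lower = np.interp(n, minima, x[minima])
    return x - (upper + lower) / 2.0

def sift(x, n_iter=10):
    """Repeat the sifting pass a fixed number of times (an assumed stopping rule)."""
    for _ in range(n_iter):
        x = sift_once(x)
    return x
```

Applied to a fast oscillation riding on a slow trend, a single pass removes most of the trend and leaves a signal close to the fast component.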

    The extracted IMF is subtracted from the original signal resulting in a residual signal R(t). Then a

    new IMF is found by setting X(t) = R(t) and reapplying the sifting process. The search for new IMFs

    is complete once the amplitude of the residual signal is smaller than some predefined value. At

    this point the resulting set of IMFs can be summed together to reconstruct the original signal.

    The EMD is not truly orthogonal, and neighbouring IMFs often have overlapping components, an

    artefact known as mixing. In an attempt to compensate for the mixing that can occur, an extension

    to the EMD known as ensemble empirical mode decomposition (EEMD) has been made [128].

    EEMD works by iteratively computing the EMD on a signal with added Gaussian noise. After

    many iterations the noise cancels out and what is left is usually a well-defined signal with less

    mixing.

    Figure 6. Decomposition of time-series recordings using EEMD.

    (A) Sample time-series signal composed from two signals of different frequencies. (B) The mean signal enclosed by

    the upper and lower envelopes of the original signal is obtained. (C) The difference of the original and mean signals

    is used to obtain IMF1. (D) The remaining signal, known as the residual, is obtained by subtracting IMF1

    from the original signal. This process continues until the residual signal equals zero or is sufficiently small as defined by a stopping

    threshold.

    In our approach the noise variance σ² was set to 0.2 and the number of iterations was set to 100,

    resulting in 9 IMFs. The IMFs spanned a large range of frequencies, with IMF1 representing

    the highest and IMF9 the lowest frequency rhythm. The theta rhythm LFOs (6-10 Hz

    frequency range) were obtained from IMF8, the delta LFOs (2-5 Hz frequency range) were

    obtained by combining IMF8 and IMF9, and the fast ripple HFOs (400-600 Hz frequency range)

    were obtained from IMF2.

    2.3 Machine Learning Algorithms

    2.3.1 Support Vector Machines

    Support Vector Machines (SVMs) have been used extensively on classification problems on

    scientific data and have shown superior performance over traditional statistical and neural

    classifiers. SVMs have the ability to minimize both structural and empirical risk leading to better

    generalization for new data classification even with a limited training dataset. Conceptually, the

    SVM algorithm determines a nonlinear boundary in the feature vector space by computing a

    maximum margin, linear boundary in the higher-dimensional space as shown in Figure 7. A

    transformation to the higher-dimensional space is achieved using kernels [129]. SVMs work by

    applying an n-dimensional transformation to construct a hyperplane that maximizes the separation

    margin between input data classes [129]. SVM binary classification requires the computation of a

    maximum margin separation specified by the following quadratic programming (QP) problem:

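    $$f(\mathbf{x}, \boldsymbol{\alpha}, b) = \{\pm 1\} = \mathrm{sgn}\!\left( \sum_{i=1}^{l} \alpha_i y_i k(\mathbf{x}_i, \mathbf{x}) + b \right) \tag{6.1}$$

    $$\max W(\boldsymbol{\alpha}) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i=1}^{l} \sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j k(\mathbf{x}_i, \mathbf{x}_j) \tag{6.2}$$

    subject to,

    $$0 \le \alpha_i \le C \tag{6.3}$$

    and,

    $$\sum_{i=1}^{l} y_i \alpha_i = 0 \tag{6.4}$$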

    The number of training samples is denoted by $l$, $\boldsymbol{\alpha}$ is a vector of $l$ variables, where each component

    $\alpha_i$ corresponds to a training point $(\mathbf{x}_i, y_i)$, and $C$ is the soft margin parameter which dictates the

    influence of outliers in the training data. The kernel $k(\mathbf{x}_i, \mathbf{x})$ is typically a nonlinear

    transformation of the input space to a higher dimensional feature space. For the purposes of this

    work the Gaussian kernel $k(\mathbf{x}_i, \mathbf{x}) = \exp(-\gamma \|\mathbf{x} - \mathbf{x}_i\|^2)$ was used. The QP problem determines

    a vector $\boldsymbol{\alpha}$, where each element specifies the weight of one training point. A select number of support

    vectors (SVs), which are the data points closest to the boundary, are used to represent the margin by

    having $\alpha$ values greater than zero. Since only a subset of the data, the SVs, is required for representing

    the maximum separation boundary, SVMs are well suited for small data sets with large

    feature vectors.
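
To make the role of the support vectors concrete, the sketch below evaluates the decision function of equation 6.1 with the Gaussian kernel for a given set of support vectors. The α values and bias are supplied by hand rather than obtained by solving the QP problem, and the function names are hypothetical.

```python
import math

def rbf_kernel(xi, x, gamma):
    """Gaussian kernel k(x_i, x) = exp(-gamma * ||x - x_i||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, x)))

def svm_decision(x, support_vectors, labels, alphas, b, gamma):
    """Evaluate the sign of sum_i alpha_i * y_i * k(x_i, x) + b."""
    score = sum(a * y * rbf_kernel(sv, x, gamma)
                for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if score + b >= 0 else -1
```

Training points with α equal to zero drop out of the sum, so only the support vectors need to be stored to classify a new feature vector.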

    SVMs with Radial Basis Function (RBF) kernels were implemented using the LibSVM library

    [130], and were trained to predict treatment outcomes for several ACDs. Training consisted of 5-

    fold cross-validation to select the best regularization parameter C and the RBF gamma parameter

    γ. The C parameter provides a tradeoff between misclassification of training examples and the

    simplicity of the decision surface. A low C yields a smooth decision surface, while a high C

    aims at classifying all training examples correctly by giving the model freedom to select

    more SVs. The gamma parameter defines the influence of a single training example, with low

    values meaning a far-reaching influence and high values a more local

    influence. An exponential grid search over the parameters C ($2^{-3}, 2^{-4}, \dots, 2^{10}$) and γ

    ($2^{-18}, 2^{-17}, \dots, 2^{-3}$) was then performed. The parameter pair yielding maximum performance on the training set

    was selected for testing.
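
The exponential grid search can be sketched independently of any particular SVM library. The sketch assumes that the C grid runs over integer exponents from -3 to 10 and the γ grid from -18 to -3; `score_fn` is a hypothetical callable that would return, for example, the mean 5-fold cross-validation accuracy for a given (C, γ) pair.

```python
from itertools import product

def exponential_grid(lo, hi, base=2.0):
    """Return [base**lo, base**(lo + 1), ..., base**hi]."""
    return [base ** e for e in range(lo, hi + 1)]

def grid_search(score_fn, c_values, gamma_values):
    """Return the (C, gamma) pair with the highest score_fn(C, gamma)."""
    return max(product(c_values, gamma_values), key=lambda pair: score_fn(*pair))

c_values = exponential_grid(-3, 10)        # assumed C grid: 2^-3 ... 2^10
gamma_values = exponential_grid(-18, -3)   # gamma grid: 2^-18 ... 2^-3
```

Under these assumptions the search evaluates 14 × 16 = 224 parameter pairs.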

    Figure 7. Description of SVM algorithm.

    A) SVMs determine a separating hyperplane between two classes by applying optimization techniques to find the

    margin with maximum separation. B) To facilitate the finding of the maximum separation, the initial input space

    data is transformed to a higher dimensional feature space where a linear separation of classes can be achieved using

    a hyperplane. This transformation is achieved with kernelization, and this higher dimensional hyperplane can be

    represented as a nonlinear separating margin in the lower dimensional input space.

    2.3.2 Random Forest

    The Random Forest (RF) classifier, proposed initially by Breiman [131], was used as one of the

    predictive models for ACD treatment outcome. The RF is an ensemble learning method for

    classification and other tasks. RFs work by constructing a multitude of decision trees, where each

    tree is independently trained to make classification decisions. The core of the random forest classifier

    is the binary decision tree, a data structure that stores elements hierarchically in nodes (Figure 8). Each

    decision tree is grown on a different bootstrapped sample collection (i.e. randomly drawn

    instances with replacement from the original dataset) using a randomly selected subset of all available

    predictors. During training a subset of the trees is randomly selected and trained on the dataset.

    On testing, the trees perform majority voting to make the final class selection,

    $$P(c|\mathbf{v}) = \sum_{t=1}^{T} P_t(c|\mathbf{v}) \tag{7}$$

    where $P(c|\mathbf{v})$ is the conditional probability of class label $c$ given input feature set $\mathbf{v}$, and $P_t$ is

    the conditional probability for each randomly drawn tree.
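
The aggregation in equation 7 can be illustrated with a short sketch. The per-tree probabilities and the class names below are made up for illustration, and the function names are hypothetical.

```python
def forest_scores(tree_probs):
    """Sum the per-tree class probabilities P_t(c|v) over all trees."""
    classes = tree_probs[0].keys()
    return {c: sum(p[c] for p in tree_probs) for c in classes}

def forest_predict(tree_probs):
    """Return the class with the largest aggregated score."""
    totals = forest_scores(tree_probs)
    return max(totals, key=totals.get)

# Three hypothetical trees voting on two hypothetical outcome classes.
example = [
    {"responder": 0.9, "non_responder": 0.1},
    {"responder": 0.6, "non_responder": 0.4},
    {"responder": 0.3, "non_responder": 0.7},
]
```

Dividing the summed scores by the number of trees would turn them into probabilities that sum to one without changing the selected class.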

    The advantage of the RF approach over other machine learning algorithms is that the random

    s