
Shegheva 1

Inferring Wakefulness in Narcoleptic Patients during Cognitive Tasks

Snejana Shegheva ([email protected])

The purpose of computing is insight, not the numbers.

Richard Hamming, 1962,

Numerical Methods for Scientists and Engineers

Abstract

The aim of this research is to explore an application of stochastic processes to the narcolepsy neurological disorder in order to optimize the individual’s learning style. A cognitive task such as writing demands attention spans often difficult to achieve for narcoleptics. In this study we investigate the possibility of probabilistic modeling of the wakefulness with the help of Hidden Markov Models.

The proposed approach involves creating a hybrid metacognitive-probabilistic system, ASLEPT¹, which monitors the cognitive state of individuals with narcolepsy in real time through inference tasks by observing their typing patterns. Results of the model are processed by the digital assistant DAWA² to derive insights about the temporal behavior of symptoms unique to the individual.

Methodology developed in the current study is bridging areas of neuroscience, machine learning, probabilistic modeling, visualization and metacognition with the hope of having a high impact on the learning process for individuals with narcolepsy disorders and perhaps many others.

Keywords: narcolepsy, cognitive tasks, probabilistic models, inference, metacognition, visualization

Introduction

Narcolepsy is a neurological sleep disorder characterized by abnormalities of the sleep-wake cycle caused by hypocretin deficiency in the brain (Peyron, 2000). The main symptoms include excessive daytime sleepiness, cataplexy, hypnagogic hallucinations, sleep paralysis, automatic behavior and disrupted nighttime sleep (Berro et al, 2014). Although narcolepsy was first described in the 19th century (Bassetti et al, 1996), the exact pathophysiological mechanism has not been fully uncovered, which makes the diagnosis process imprecise and often delayed (Won et al, 2014). The complexity of this disorder has gained more visibility in recent years in neuroscience communities, which has helped drive research towards a better understanding of the nature of the brain’s abnormalities (Thannickal et al, 2015).

1 ASLEPT: acronym for Awakefulness State Learning Evidenced by Pattern of Typing

2 DAWA: acronym for Digital Assistant for Writing Activities


Multiple investigations have aimed at estimating the cognitive deficit in narcoleptic patients, reporting impairment in attention span and executive control tasks (Naumann et al, 2006). It has been noted that narcolepsy sufferers are more likely to report concentration and learning difficulties, especially on non-routine tasks (Smith et al, 1992). A decreased ability to focus on more complex cognitive tasks has been largely attributed to the necessity of continuously allocating cognitive resources to vigilance monitoring (Naumann et al, 2006).

These findings suggest that the learning process is challenging for narcoleptic individuals whenever it requires maintaining a high level of concentration and focus. With this research we seek to answer the question of whether technology can provide adaptive mechanisms to improve the learning process for individuals with narcolepsy.

We hope to take the first steps in creating a personalized application with an initial focus on improving the experience of essay writing: learning about the individual’s changes in typing patterns, educating the individual about them, and helping to alter or adjust specific habits if necessary. The goal is to empower people with neurological disorders to take control of their learning strategies by improving their understanding of the temporal behavior of their symptoms.

Obstacles in Essay Writing for Narcoleptic Patients

“Writer's block is a condition ... in which an author loses the ability to produce new work” [Wikipedia, retrieved 2015, August 27]. For individuals with narcolepsy this block is significantly harder to overcome, because the inability “to produce new work” is generally prompted by symptoms of the sleep disorder. It is nearly impossible for narcoleptic people to anticipate the exact timing of the next wave of drowsiness and the irresistible urge to sleep. In an overwhelming number of cases, by the time individuals realize that they have shifted from wakefulness to the first phase of REM, their motor activity has been drastically reduced (Lopes et al, 2014). Such a cataplectic episode commonly causes the phenomenon observed in Fig. 1.

The outcome of interrupted writing activity may vary from unfortunate and unrecoverable corruption of the current document to a loss of ideas caused by frequent transitions between being alert and drowsy. While we cannot hope to recover the lost ideas due to their intangible nature, we can minimize the chance of losing the results of the effort invested so far.


Figure 1. Writing activity in the middle of a narcoleptic episode³

(Image courtesy of Michael Maldonado’s personal experience)

Metacognitive Intervention

The goal of the research is to build an adaptive model trained on the narcoleptic individual’s unique pattern

of typing. This in turn would allow an accurate prediction of the incoming wave of sleepiness and in general

provide an insight on the symptoms’ behavior.

Continuous real-time monitoring and feedback are organized within a metacognitive framework that aims to help individuals recognise the symptoms beforehand. Metacognition is the art of thinking about thinking and as such is often described as a means to learn about one's own strengths and weaknesses. Individuals can learn about different approaches best suited to their own unique learning capabilities (metacognitive knowledge) and monitor their performance under different circumstances in order to adjust strategies for optimal learning (metacognitive regulation). A further subdivision of metacognitive knowledge into three variables, person, task and strategy (Flavell et al, 1979), can be mapped to the cognitive task of writing essays. An individual with narcolepsy (person variable) is aware of the general difficulties of writing an essay (task variable) due to the disruptive nature of the symptoms. What can a narcoleptic do, and when (strategy variable), in order to accomplish the task? When we think of these variables as functioning independently, we lack the ability to act in a timely fashion: there is no advance warning of the incoming symptom that would allow any action to be taken.

3 The image demonstrates a meaningless sequence of characters most likely caused by muscle weakness in narcoleptic patients during the cognitive task.

At the intersection of the three variables, as visualized in Fig. 2, there is an opportunity for an educational technology to help a narcoleptic monitor and regulate their learning process. An application which can accurately model the wakefulness state with unique characteristics for each individual can provide a timely warning to avoid corruption of the document and subsequent frustration.

Figure 2. Venn Diagram of the elements of metacognitive knowledge

Metacognition may take multiple forms, and in recent years Bayesian theories have attracted more attention in attempts to describe various cognitive processes (Lee et al, 2014). Bayesian principles of inference have shown success in modeling human cognition and would be a good foundation for reasoning about the symptom behavior of neurological disorders such as narcolepsy.

Latent Model for Wakefulness Inference

Motivation for Probabilistic Programming

As mentioned in the sections above, narcolepsy is a complex phenomenon whose symptoms manifest almost uniquely in each individual. Let us take the most common symptom, frequent daily sleep attacks (EDS, excessive daytime sleepiness), and see if we can accurately describe the process, starting with the assumption of its deterministic nature.

If we knew all the variables which cause EDS, we should be able to predict the next wave of drowsiness exactly. However, here lies the problem: there are too many variables, such as the individual's age, the severity of the disorder, the type and amount of medication taken, the time of day, the general physical state of the individual, and so on.

This makes the problem intractable very quickly due to a complex interleaving of the intrinsic properties of each variable. Their causal relationship, however, can be captured with stochastic models, which have rapidly gained popularity due to their power and systematic approach for modeling real physical processes which carry some amount of uncertainty (Remke et al, 2014). Thus, by expressing the transitions of wakefulness states through probabilistic variables we have a better chance of extracting significant insights from the behavior of narcoleptic symptoms, which in turn would allow better control in more complex and involved cognitive tasks.

Furthermore, by thinking about mathematical structures and algorithms in conjunction with the cognitive

processes of the mind we can develop sophisticated tools to better understand the complex human cognition as a

physical process (Chater et al, 2006).

Modeling Wakefulness with Directly Observable Variables

In the context of narcolepsy the real physical process which we are interested in is wakefulness, characterized as a signal which produces observations. We will start with the discrete representation of the signal⁴ through the random variable $Y$:

$$Y = \{\text{Asleep}, \text{Drowsy}, \text{Awake}\}$$

which takes one of the three values at each point in time $T$.

By observing the dynamics of the variable in the past and present we would like to be able to predict its value at time $T+1$. This assumption that past states can influence future states creates the foundation for probabilistic reasoning over time and forms the basis of Markov Chains (Russell and Norvig, 1995).

Graphically the model is illustrated in Fig. 3, and in this particular example it is a first-order Markov Chain where the future state is a function of the present state only, a phenomenon also known as the Markov property.

Figure 3. Markov Chain for Wakefulness State Transition Over Time

Using mathematical notation, this property can be expressed as a conditional independence:

$$P(Y_t \mid Y_{0:t-1}) = P(Y_t \mid Y_{t-1})$$

4 Wakefulness can be modeled as either discrete or continuous and the choice for selecting discrete representation is merely for convenience of modeling


In addition to the simplifying assumption of memorylessness of the process, we will also consider time as a discrete variable where measurement of the wakefulness state takes place at discrete intervals of time: 5 sec, 10 sec, 15 sec, etc.

Within each time slice $T$, the random variable $Y$ changes according to transition rules which remain stationary over time. In the simplistic example demonstrated in Fig. 4, the probability of getting drowsy is $P(\text{drowsy} \mid \text{awake}) = 0.5$ and the probability of waking up is $P(\text{awake} \mid \text{asleep}) = 0.2$.

Figure 4. Markov Process example of the transition probabilities for Wakefulness signal
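A chain of this kind can be simulated in a few lines. The sketch below is only illustrative: the two probabilities quoted in the text, P(drowsy | awake) = 0.5 and P(awake | asleep) = 0.2, are taken from Fig. 4, while every other entry of the transition table, the function name `simulate_chain` and the seed are assumptions made so that each row sums to one.

```python
import random

STATES = ["awake", "drowsy", "asleep"]

# Outer key = current state, inner key = next state.
# Only P(drowsy | awake) = 0.5 and P(awake | asleep) = 0.2 come from the text;
# all remaining entries are assumed values chosen so that each row sums to 1.
TRANSITIONS = {
    "awake":  {"awake": 0.4, "drowsy": 0.5, "asleep": 0.1},
    "drowsy": {"awake": 0.3, "drowsy": 0.4, "asleep": 0.3},
    "asleep": {"awake": 0.2, "drowsy": 0.3, "asleep": 0.5},
}


def simulate_chain(start, steps, rng):
    """Sample a first-order Markov chain: the next state depends only on the current one."""
    path = [start]
    for _ in range(steps):
        row = TRANSITIONS[path[-1]]
        next_state = rng.choices(list(row), weights=list(row.values()))[0]
        path.append(next_state)
    return path


rng = random.Random(0)  # fixed seed so the run is repeatable
path = simulate_chain("awake", 20, rng)  # 20 steps, e.g. 20 measurements 5 sec apart
print(path)
```

Because the transition table is stationary, the same `TRANSITIONS` dictionary is reused at every step; only the current state changes.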

Modeling Wakefulness with Latent Variables

The model described above, which adopts the position of direct observability of the wakefulness signal, has a strong limitation: it assumes the existence of a stream of annotated data mapping the actual wakefulness to its encoded representation⁵. In reality, we have to deal with the absence of direct observations and build the model from the sequential data emitted by the actual process.

A system where a signal source is either unavailable or costly to observe is commonly modeled with a

certain type of stochastic signal representation — Hidden Markov Model (HMM) which can provide a great amount

of insight about the source (Rabiner, 1989).

By treating the writing activities as a byproduct of wakefulness we can build both an inference and a prediction system: Awakefulness State Learning Evidenced by Pattern of Typing (ASLEPT). To successfully explain the wakefulness state from the stream of observations we require knowledge of:

transition rules describing the dynamics (laws) of the wakefulness state

emission rules describing the influence of the wakefulness state on the quality of the user-produced text

5 This could be accomplished with high resolution video streams and advanced techniques from Computer Vision

Algorithms such as Viterbi, which finds the most likely sequence of hidden states, and Baum–Welch, which estimates the model parameters, have been widely applied to Hidden Markov Models and are especially relevant in real-time systems (Ardö, 2007). The key components required to generate a timely warning about the next sleep wave are:

interpretation of the current typing pattern

inferring the wakefulness states by correlating them with the observations

predicting transitions in the wakefulness state

Let us consider another random variable $X$ which will map to the evidence directly observed from the typing activity. There is some flexibility in how the observation variable is encoded, and some literature focuses on important techniques for measuring error rates in user-entered text: keystrokes per character (KSPC) and characters per second (CPS) (Soukoreff et al, 2003). Some adaptation of these techniques could be applied to the problem at hand, in particular to detecting the quality of entered text in real time while writing an essay.
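As a rough illustration of how these two measures might be computed for one measurement window, consider the sketch below. It assumes the common reading of KSPC as total keystrokes (including corrections) divided by the characters left in the text, and of CPS as those characters divided by the window length. The `TypingWindow` type and the sample numbers are hypothetical and are not part of the study.

```python
from dataclasses import dataclass


@dataclass
class TypingWindow:
    """Hypothetical summary of one measurement window of typing activity."""
    keystrokes: int     # every key press in the window, including corrections
    final_chars: int    # characters remaining in the text after corrections
    seconds: float      # length of the window


def kspc(window):
    """Keystrokes per character: 1.0 means no corrections, higher means more rework."""
    if window.final_chars == 0:
        return float("inf")  # keys were pressed but nothing usable was produced
    return window.keystrokes / window.final_chars


def cps(window):
    """Characters per second produced in the window."""
    if window.seconds <= 0:
        return 0.0
    return window.final_chars / window.seconds


window = TypingWindow(keystrokes=30, final_chars=24, seconds=5.0)
print(kspc(window), cps(window))
```

A text quality estimator could combine the two, for example by penalizing a rising KSPC together with a falling CPS, but the exact combination is left open here.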

Connecting the previously defined variable $Y$ for wakefulness with the variable $X$ for evidence, the model from Fig. 4 is reconstructed with an additional layer as shown in Fig. 5. The direct relationship between the evidence $X$ and the actual process $Y$ is modelled by the conditional probability $P(X_t \mid Y_t)$, which expresses the causality whereby the quality of the typing depends on the current wakefulness state of the narcoleptic individual.

Figure 5. Temporal transitions between hidden narcolepsy states and direct observations.
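For reference, the two assumptions of this layered model (a first-order Markov chain over the hidden states $Y$, and observations $X_t$ that depend only on the current state $Y_t$) imply the standard factorization of the joint distribution of a Hidden Markov Model. It is written here in the notation of this section; the initial distribution $P(Y_0)$ is not specified in the text and is an assumed ingredient:

$$P(X_{0:T}, Y_{0:T}) = P(Y_0)\, P(X_0 \mid Y_0) \prod_{t=1}^{T} P(Y_t \mid Y_{t-1})\, P(X_t \mid Y_t)$$

Each factor $P(Y_t \mid Y_{t-1})$ is an entry of the transition model and each factor $P(X_t \mid Y_t)$ is an entry of the emission model.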

The model topology, when looked at from the time slice $T$ demonstrated in Fig. 6, reflects the layered structure of the proposed hierarchical system. The observed layer (emission model) describes how the quality of the entered text is affected by the individual’s state, e.g. being drowsy negatively correlates with speed of typing. The inner hidden layer (transition model) describes the dynamics of the actual states, e.g. what is the likelihood of being drowsy now if one has been drowsy for the last N observations.


Figure 6. Hidden Markov Model topology describing the system for observing text quality and inferring the wakefulness state

The above topology of the HMM also known as a Trellis diagram allows making certain types of inferences about

the underlying wakefulness process:

what is the probability of being in one of three states — awake, drowsy or asleep — given the sequential

observations of the entered text

how can we detect changes in the wakefulness state given the observations

what are the descriptive attributes of the individual’s narcolepsy — how often is the individual in any one

of the states; does any state prevail and at what times of the day

how can we visualize narcolepsy transitions in order to educate the individual about their specific and

unique behavior
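The first of these questions, the probability of each state given the observations so far, is answered by the forward (filtering) pass of an HMM. The sketch below is a minimal hand-written version under assumed parameters: the initial distribution, transition matrix, and the Gaussian emission means and standard deviations are illustrative values, not quantities fitted in the study.

```python
import math

STATES = ["awake", "drowsy", "asleep"]

# All numbers below are illustrative assumptions, not fitted parameters.
START = [0.8, 0.15, 0.05]
TRANS = [
    [0.8, 0.15, 0.05],  # from awake
    [0.3, 0.5, 0.2],    # from drowsy
    [0.1, 0.3, 0.6],    # from asleep
]
MEANS = [0.9, 0.5, 0.1]    # expected text quality per state
STDS = [0.1, 0.15, 0.1]    # spread of text quality per state


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))


def forward_filter(observations):
    """Return P(state at t | observations up to t) for every t."""
    n = len(STATES)
    beliefs = []
    prior = START
    for x in observations:
        # Update: weight the predicted state probabilities by the emission likelihood.
        unnormalized = [prior[i] * gaussian_pdf(x, MEANS[i], STDS[i]) for i in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        beliefs.append(belief)
        # Predict: push the belief through the transition model for the next step.
        prior = [sum(belief[i] * TRANS[i][j] for i in range(n)) for j in range(n)]
    return beliefs


observations = [0.92, 0.88, 0.55, 0.45, 0.12, 0.08]
beliefs = forward_filter(observations)
for x, belief in zip(observations, beliefs):
    print(x, STATES[belief.index(max(belief))], [round(p, 3) for p in belief])
```

A change in the most probable state between consecutive measurements is one simple way to flag a transition, which addresses the second question in the list.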

Cognitive Process Modeling and Preliminary Results

Simulated Gaussian Process for Observations

An activity of writing an essay can be seen as a continuous stream of actions, such as a user entering

characters, or void actions for times when a user suspends typing. By periodically analyzing and summarizing the

outcome of actions we can create a text quality estimator with a function to provide a normalized scalar observable

variable for each interval of time when a measurement occurred. We will take the Bayesian approach to simulate

such observations since it allows us to express our own believability in an event (Patil et al, 2010).

To resemble the real physical process as much as possible it is preferable to add noise to the model of observations. A Gaussian Process (GP) provides a suitable foundation for simulating temporal patterns for non-stationary time series of text quality observations. The desired form of the Gaussian process is achievable by choosing hyperparameters for priors with properties which best reflect our belief about the underlying physical process (Brahim-Belhouari et al, 2004).

A simple hierarchical model for encoding our beliefs of text quality during the cognitive task of writing in

narcoleptic patients can be represented as follows:

$\alpha_{state_i}, \beta_{state_i}$: hyperparameters which control the shape of the prior distribution

$\mu_i \sim \text{Beta}(\alpha_{state_i}, \beta_{state_i})$: prior of the text quality RV⁶ expressed through the Beta distribution

$\sigma \sim \text{Uniform}(0, 1)$: standard deviation of the text quality RV

$x_i \sim \text{Gaussian}(\mu_i, \frac{1}{\sigma^2})$: text quality RV modelled with a Gaussian (Normal) distribution

Fig. 7 presents a graphical model for simulating the Gaussian Process for observations. Here, the simulated data $x_i$ are the values which a text quality function takes at each interval of time.

Figure 7. Graphical model for simulating Gaussian Process of observations

Variance of the observations is controlled by the sigma parameter and it models our expectation of the data

volatility. At every simulated measurement, the observation is associated with a normally distributed variable with

the mean being the expected value of the text quality. The prior of the expected value is naturally modeled with

Beta distribution since it has an intuitive interpretation of the parameters alpha and beta as counts of prior successes

and prior failures (Lee et al, 2014).

The principal point in this simulation therefore is finding the appropriate values for the hyperparameters alpha and beta, which will vary depending on the wakefulness state (see Fig. 8). What the graph below shows is that we expect the text quality to be high for high levels of wakefulness ($\mu \sim 0.9$), average for reduced levels of alertness ($\mu \sim 0.5$) and worst for low levels of wakefulness ($\mu \sim 0.0$).

6 RV here stands for a Random Variable

Figure 8. Prior Beta distributions defined for the three states of wakefulness.
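The hierarchical simulation can be sketched with the Python standard library alone. Several choices below are assumptions rather than values from the study: the Beta hyperparameters (picked so that the prior means are 0.9, 0.5 and 0.1), the reading of the second Gaussian argument as a precision (so that `sigma` is a standard deviation), the hypothetical `sigma_scale` factor that keeps the noise moderate, and the clipping of observations to the unit interval.

```python
import random

# Assumed Beta hyperparameters (alpha, beta) per state; the prior means
# alpha / (alpha + beta) are 0.9, 0.5 and 0.1 respectively.
PRIORS = {
    "awake": (9.0, 1.0),
    "drowsy": (5.0, 5.0),
    "asleep": (1.0, 9.0),
}


def simulate_quality(state_sequence, rng, sigma_scale=0.1):
    """Draw one noisy text quality observation for each state in the sequence."""
    observations = []
    for state in state_sequence:
        alpha, beta = PRIORS[state]
        mu = rng.betavariate(alpha, beta)          # mu_i ~ Beta(alpha, beta)
        sigma = rng.uniform(0.0, 1.0) * sigma_scale  # sigma ~ Uniform(0, 1), rescaled (assumption)
        x = rng.gauss(mu, sigma)                   # x_i ~ Gaussian with mean mu and std sigma
        observations.append(min(1.0, max(0.0, x)))  # keep the normalized scalar in [0, 1]
    return observations


rng = random.Random(42)
states = ["awake"] * 5 + ["drowsy"] * 5 + ["asleep"] * 5
observations = simulate_quality(states, rng)
print([round(x, 2) for x in observations])
```

Feeding the function a longer state sequence with abrupt switches between states produces the sudden declines and rises of text quality that the simulation is meant to exhibit.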

The described model produces the results shown in Fig. 9, which exhibit the desired pattern in the simulated observations. Of course there is an unlimited number of patterns which can be generated; however, in the interest of moving forward with the modeling of wakefulness, we will pick the one below since it displays the desirable properties of sudden declines and rises of the observed variable.

Figure 9. Simulated observations of the Text Quality variable for the cognitive writing task in narcoleptics.

Realtime Wakefulness Inference with Gaussian Hidden Markov Model

There are many advantages in using a Bayesian approach for making inference over sequential latent data. One of the most appealing traits is the ability to estimate the parameters of complex and analytically intractable models with the help of computational methods like MCMC (Patil et al, 2010).

It is very important, however, to consider the convergence of the Bayesian Hidden Markov Model, especially for long sequences. Latent models with a large parameter space are notorious for getting stuck in local minima, which can in some cases be addressed by collecting more samples or by lifting the time constraint and letting the sampling algorithms run longer in order to better explore the parameter space.

In the experiment of inferring the wakefulness state over extended periods of time (a highly parametric model), receiving accurate results in real time was infeasible within the Bayesian setting. Learning transition and emission parameters with high predictive power required more time than the window within which the feedback to the user would still be relevant.

To address this limitation, the model was designed in a non-Bayesian setting where no prior distribution over latent states was specified. Formally the model can be fully described as follows:

$N = 3$: number of latent states: Awake, Drowsy and Asleep

$T = \frac{\Delta t}{5\,\text{sec}}$: number of observations in the time frame $\Delta t$, with a delay of 5 sec between measurements

$\phi_{i=1..N,\,1..N} \sim \text{Dirichlet}(\beta)$: transition matrix between states

$\mu_{i=1..N}$: mean of observations associated with each state

$\sigma^2_{i=1..N}$: variance of observations associated with each state

$x_{t=1..T} \sim \text{Categorical}(\phi_{x_{t-1}})$: latent state at the measurement time $t$

$y_{t=1..T} \sim \text{Gaussian}(\mu_{x_t}, \sigma^2_{x_t})$: observation value at the measurement time $t$

The Viterbi algorithm implemented in the hmmlearn library allows estimating the most likely sequence of hidden states, the Viterbi path, from the observed events in real time (Andreas et al, 2014). Fig. 10 demonstrates results from inference over two hours of observations which were streamed from the Gaussian Process described in the previous section.
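To make the decoding step concrete, the sketch below implements the Viterbi recursion by hand for a three-state Gaussian HMM. It deliberately does not call hmmlearn, so it should be read as an illustration of the recursion rather than as the code used in the study. All parameters (initial distribution, transition matrix, emission means and variances) are assumed values, not the ones learned from the data.

```python
import math

STATES = ["awake", "drowsy", "asleep"]

# Illustrative, assumed parameters of a Gaussian HMM.
START = [0.8, 0.15, 0.05]
TRANS = [
    [0.8, 0.15, 0.05],
    [0.3, 0.5, 0.2],
    [0.1, 0.3, 0.6],
]
MEANS = [0.9, 0.5, 0.1]
VARIANCES = [0.01, 0.02, 0.01]


def log_gaussian(y, mean, var):
    return -0.5 * (math.log(2 * math.pi * var) + (y - mean) ** 2 / var)


def viterbi(observations):
    """Return the most likely state sequence and its joint log-probability."""
    if not observations:
        return [], 0.0
    n = len(STATES)
    # Best log-score of any path that ends in state i after the first observation.
    score = [math.log(START[i]) + log_gaussian(observations[0], MEANS[i], VARIANCES[i])
             for i in range(n)]
    backpointers = []
    for y in observations[1:]:
        step, new_score = [], []
        for j in range(n):
            # Best predecessor for state j at this time step.
            best_i = max(range(n), key=lambda i: score[i] + math.log(TRANS[i][j]))
            step.append(best_i)
            new_score.append(score[best_i] + math.log(TRANS[best_i][j])
                             + log_gaussian(y, MEANS[j], VARIANCES[j]))
        backpointers.append(step)
        score = new_score
    # Trace the best path backwards from the best final state.
    state = max(range(n), key=lambda i: score[i])
    best_log_prob = score[state]
    path = [state]
    for step in reversed(backpointers):
        state = step[state]
        path.append(state)
    path.reverse()
    return [STATES[i] for i in path], best_log_prob


observations = [0.93, 0.9, 0.85, 0.52, 0.48, 0.15, 0.07, 0.1]
path, log_prob = viterbi(observations)
print(path, round(log_prob, 2))
```

Working in log space avoids numerical underflow on long sequences such as two hours of measurements taken every 5 sec. In a streaming setting the recursion can be rerun, or extended incrementally, as each new observation arrives.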

Even a visual analysis of the results suggests a high quality of inference, where dense streaks of green color associated with the awake cognitive state coincide with a high quality of emissions. Similarly, the areas of yellow and red color, associated with the drowsy and asleep cognitive states respectively, accurately follow the reduced levels of text quality. The model exhibits a very small ramp-up time during which the latent states have not been fully calibrated and the confusion matrix⁷ may have high values in the off-diagonal sections.

The visualization methodology deliberately combines observations and latent states into a single graph to be as intuitive and simple as possible, without demanding analytical skills from the end user.

7 More on the confusion matrix analysis to be discussed in the evaluation section.


Figure 10. Real-time inference of the wakefulness state⁸ from synthetic text quality observations⁹.

Based on the observations and inferred states, various metrics can be collected for interpreting and summarizing the overall changes in the cognitive state. The simplest metric, also computed in real time, is shown in Fig. 11 and demonstrates the prevalence of any one of the cognitive states in addition to their association with the various levels of text quality.

This is just the start of the types of insights that can be extracted for learning. Additional analysis can show the impact of further factors on wakefulness, such as the time of day or the elapsed time of the performed cognitive tasks. No correlation with external factors, such as the individual’s overall physical state, is considered in this analysis; however, this limitation can be addressed by additional sensors which would collect more observations from the environment.

When an insight is probabilistic in nature, it is necessary to provide some level of confidence in the result of the inference task. There exist multiple quantitative methods to express the believability of an event; here, preference is given to a simple graphical visualization of the goodness of fit expressed by the logarithmic likelihood of the sequence of inferred hidden wakefulness states. The likelihood metric visualized in Fig. 11 does not provide an objective answer about the accuracy of the estimation, nor about the minimum number of observations required for the task. Such subjectivity is, however, inherent in all stochastic environments, and after some period of adjustment the metric is relatively easy for the end user to interpret regardless of their analytical skill level.

8 It is difficult to capture the real-time nature of the inference with a static image.

9 The upper curve in blue represents the stream of observations following the modelled Gaussian process, which exhibits various levels of text quality. The lower band communicates changes in the wakefulness state in real time as new observations are received into the model. The three states are encoded with three different colors: green for awake, yellow for drowsy and red for asleep. The graph is highly interactive, which allows users to hover over the data points to see their measured values. Additionally, users can zoom in and out to explore different areas within the graph for either real-time or post-event analysis.

Figure 10. Summary of the cognitive states and their association with various levels of text quality.¹⁰

Figure 11. Logarithmic likelihood probability of inferred latent states.¹¹

Model Predictive Power Evaluation

According to statistician George Box, “all models are wrong, but some are useful”. In the quest to assess model fitness, various methods have been used in research, and the posterior predictive check is amongst the most commonly used ones (Gelman et al, 1996). So is the model implemented above good? And if it is good, in what ways is it so? By sampling from the posterior distribution and comparing the samples to the measured observations we can estimate the model’s predictive power. Fig. 12 demonstrates the capabilities of the current model, and based on visual inspection there are no systematic discrepancies between the data used as observations and the data sampled from the model. There is a slight weakness in predicting text quality in the 0.4-0.6 range within the current sample; overall, however, the model appears reasonable.

10 The graphs for summarizing wakefulness states receive data from real-time inference and sort it according to latent states. End users can interact with the data and inspect how the text quality threshold changes in each of the defined cognitive states.

11 Logarithmic likelihood demonstrates the confidence of the model in the explained latent states with, again, a very small ramp-up time. As one can observe, the level of likelihood increases as time passes, which is explained by the presence of more observations. At some point, however, the model saturates (~1000 measurements), after which no significant increase is recorded.

Figure 12. Predictive Posterior Check of the Wakefulness Inference Model

Normally in unsupervised learning, any kind of model metric would be vague since we are not comparing it

to any truth values. However, the simulated Gaussian Process was based on the target prior process which we

specifically designed to follow some predefined shapes. By reverse engineering the values of the target process and

constructing a contingency table (see Fig. 13), we can further visualize the performance of the model.

Figure 13. Confusion Matrix for inferring Wakefulness state¹²

12 The confusion matrix demonstrated satisfactory accuracy values for inferring wakefulness states. Note the high values on the primary diagonal.
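A contingency table of this kind is straightforward to compute once the target states of the simulated process and the inferred states are aligned in time. The sketch below uses two short made-up sequences purely for illustration; the function names and the data are hypothetical and do not reproduce the values of Fig. 13.

```python
from collections import Counter

STATES = ["awake", "drowsy", "asleep"]


def confusion_matrix(true_states, inferred_states):
    """Rows are the target states, columns are the inferred states."""
    counts = Counter(zip(true_states, inferred_states))
    return [[counts[(t, p)] for p in STATES] for t in STATES]


def accuracy(matrix):
    """Share of measurements on the primary diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Made-up example sequences, one entry per measurement.
truth = ["awake", "awake", "drowsy", "drowsy", "asleep", "asleep"]
inferred = ["awake", "drowsy", "drowsy", "drowsy", "asleep", "drowsy"]

matrix = confusion_matrix(truth, inferred)
for state, row in zip(STATES, matrix):
    print(f"{state:>7}", row)
print("accuracy:", round(accuracy(matrix), 3))
```

High counts on the primary diagonal correspond to agreement between the target and inferred states; off-diagonal counts show which states are confused with each other.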


Andrew Gelman often emphasizes the power of graphical methods over typical frequentist tests such as the p-value, where the former are more intuitive even if less objective (Gelman et al, 2013). The last evaluation of model fitness uses separation plots, a methodology described by Greenhill et al in 2011. Our conclusion is that separation plots would be more suitable for Bayesian modeling and general model comparison. In the current scenario with Hidden Markov Model inference, the amount of data drowns out the individual misclassifications, making the separation plot not very informative.

Figure 14. Separation Plots for Inferring Wakefulness States¹³

Wakefulness Metacognosis¹⁴

The important question that we anticipate is: what is the learning goal of the research, and how does the

inference of wakefulness help achieve that goal? Modeling is just one aspect of the problem which in itself is by no

means a complete solution. The necessary step following computations is extracting the insights which can be

visualised in order to diagnose the individual's cognitive state and provide recommendations.

A Digital Assistant for Writing Activities (DAWA) serves the purpose of monitoring the current wakefulness state and playing an active role as the metacognitive mentor. Current capabilities include real-time state change detection and an extended on-demand summarization of the performance and overall tendency of transitioning between states.

A future version of DAWA may initiate a short dialog with the individual by either asking a few questions that further probe the state, or making a joke based on the written content, which can potentially lead the individual out of the hypnotic trance. Strong emphasis is placed on adding interactive capabilities to provide the authentic feel of an AI assistant.

13 A separation plot demonstrates the model fitness through the use of a visual yardstick: a perfect model separates high probabilities for actual occurrences of the event.

14 Here, the term metacognosis signifies diagnosing metacognition with the purpose of teaching narcoleptic patients to be cognizant of the temporal behavior of their symptoms.

Personalization of the digital assistant with an avatar can lead to further gains in the positive effects of metacognitive tutoring (Joyner, 2015). Fig. 15 showcases an example of such a realization by bringing in a folklore character, the dwarf, with inspiration drawn from the creature’s mining skills¹⁵. Without being too intrusive into the already highly involved cognitive task of writing, DAWA silently monitors the changes in the inferred cognitive states and expresses an insight through animated facial features (see images a. for Awake, b. for Drowsy and c. for Asleep).

Figure 15. Personalization of the metacognitive assistant DAWA portraying three different cognitive states: a) Awake b) Drowsy c) Asleep

Another advantage in using a metacognitive tutor is the ability to adjust to the individual’s performance

knowledge base collected over an extended period of time. Heterogeneity of the symptoms in narcoleptic

individuals can be potentially further analyzed and compared with other individuals with or without neurological

disorders.

Since there are numerous activities available for a metacognitive tutor, some metacognitive tools have taken the direction of creating a separate avatar for each type of activity. However, for narcoleptic individuals it is beneficial to remain with a single avatar to preserve attention on the current cognitive task. A metacognitive tutor can automatically self-assign one of the six currently defined roles to reflect the nature of the assistance. Table 1 lists the current roles and their frequency of activation.

15 We draw the analogy between gemstone mining and insights mining for narcolepsy


| Role | Description | Activation Frequency |
| Insights | Monitor the results of inference tasks and detect changes in the cognitive state | Constant monitoring |
| Questions | Post a question to the user when certain undesired behavior persists (e.g. remaining in drowsy state) | Very low frequency to avoid distraction and the need for constant user input |
| Suggestions | Recommend a different time to perform the cognitive tasks if a cognitive state declines drastically | Reduced frequency (every N measurements) |
| Actions | Suspend typing by prompting the user with a dialog to continue, to avoid document corruption | Intermittent; activated when the inferred state is asleep |
| Miscellaneous | Retrieve inspirational quotes/jokes from a local database to allow a mental break during a complex cognitive task | If the cognitive task has been performed for a while and degradation has been detected |
| Report Generation | Similarly to insights, tries to capture the change; however, in this case it is a much more extended version using the Empirical Bayesian approach | Once at the end of each session |

Table 1. List of current roles for the metacognitive tutor.

The metacognitive tutor DAWA has access to inference results and it activates one of the roles selected

based on the observed conditions. DAWA applies various statistical tools such as variance, central tendencies, and

linear regression to detect basic trends and to reduce some potential noise, especially at the very early stages of

inference when the model is not fully calibrated.
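As an illustration of the kind of trend detection described above, the following is a minimal sketch in Python and not DAWA's actual implementation. The function name `summarize_trend`, the window size, and the synthetic stream of inferred "awake" probabilities are all assumptions made for the example.

```python
import numpy as np

def summarize_trend(p_awake, window=20):
    """Summarize the last `window` inferred probabilities of being awake.

    Hypothetical helper: returns central tendencies, variance, and the
    slope of a least-squares line fitted to the recent values.
    """
    recent = np.asarray(p_awake, dtype=float)[-window:]
    t = np.arange(recent.size)
    # Degree-1 polynomial fit = simple linear regression; slope is the trend.
    slope, _intercept = np.polyfit(t, recent, deg=1)
    return {
        "mean": float(recent.mean()),
        "median": float(np.median(recent)),
        "variance": float(recent.var(ddof=1)),
        "slope": float(slope),
    }

# Synthetic stream (assumption): alertness declining steadily over 40 measurements.
stream = np.linspace(0.9, 0.5, 40)
summary = summarize_trend(stream, window=20)
print(summary)
```

A negative slope combined with a low variance would suggest a steady decline rather than noise, which is the kind of condition under which a role such as Suggestions might be activated.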

One of its roles, report generation, deserves a more detailed look, as it builds a Bayesian hierarchical

model by learning from the already inferred tasks. This approach, known as Empirical Bayes, is commonly

described as a trick that bridges frequentist and Bayesian inference (Davidson-Pilon, 2015). The essence of the

method is to inform the priors with observed data, which violates the philosophical view of Bayesian inference

but is almost always helpful in the convergence process.
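To make the idea of informing priors with observed data concrete, here is a small sketch under stated assumptions. It assumes the prior over each row of the wakefulness transition matrix is a Dirichlet distribution and that previously inferred sessions are available as integer state sequences (0 = Awake, 1 = Drowsy, 2 = Asleep). The function names and the `strength` parameter are hypothetical and do not come from the original system.

```python
import numpy as np

def transition_counts(states, n_states=3):
    """Count observed transitions i -> j in one inferred state sequence."""
    counts = np.zeros((n_states, n_states))
    for prev, nxt in zip(states[:-1], states[1:]):
        counts[prev, nxt] += 1
    return counts

def empirical_bayes_prior(past_sessions, strength=10.0, n_states=3):
    """Dirichlet pseudo-counts per row, informed by earlier sessions.

    `strength` (assumption) is the total pseudo-count per row: a small
    value keeps the prior weak, so new data can still override it.
    """
    total = np.ones((n_states, n_states))  # add-one smoothing: no zero prior mass
    for session in past_sessions:
        total += transition_counts(session, n_states)
    row_freq = total / total.sum(axis=1, keepdims=True)
    return strength * row_freq

# One previously inferred session (synthetic, for illustration only).
past = [[0, 0, 0, 1, 1, 2, 2, 2, 0]]
prior = empirical_bayes_prior(past)
print(prior)
```

Each row of `prior` can then serve as the concentration parameter of a Dirichlet prior for the corresponding row of the transition matrix.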

Earlier, while describing the real-time inference model, we hinted at the challenge of achieving

timely results with a purely Bayesian Hidden Markov Model. Although such a model is not well suited to

real-time parameter learning, it is still very useful for generating reports at the end of cognitive tasks to evaluate

overall performance and various tendencies. Fig.16 demonstrates the graphical model for learning wakefulness

transition parameters. The impact of the violation aspect is minimized by using a large stream of observations and

modeling the uncertainty of the priors with high values.

Figure 16. Graphical Representation of the Wakefulness Model

By fitting this model to the wakefulness state and running MCMC until a satisfactory level of convergence is

reached, we can estimate the transition parameters and report the likelihood of change from any current state.

This generates another source of insights, which can also form the history of performance on the cognitive tasks.

Fig.17 shows heatmaps of the wakefulness state transition matrix before and after fitting. 16 17

Figure 17. Sampled Prior (left) and Posterior (right) Transition Matrices

An interesting insight, based on the current sample only, is that the wakefulness process is very sticky. 18

16 Since it is not trivial to demonstrate a prior of the matrix, we just randomly sampled one for visualization purposes.

17 The Posterior transition matrix is a range of matrices, but for visualization purposes we averaged them across 10000 MCMC chains and a 5000 burn value and a thinning parameter equal to 10, to make sure we observe the good behavior of autocorrelation.

18 Stickiness means a property of remaining in the current state with low likelihood of switching.

However, one must take care not to generalize this to narcolepsy behavior as a whole. Rather than sampling

a single transition matrix, the beauty of the Bayesian approach is that it gives us a posterior distribution

over the transition matrices, showing how likely each transition is to be observed. Fig.18 summarizes transition

values between all defined states: Awake, Drowsy and Asleep. This level of insight, while not well suited to

real-time estimation of cognitive change, serves the purpose of comparing performance across different cognitive tasks,

or across individuals with varied stages of narcolepsy on the same tasks.

Figure 18. Posterior of the Wakefulness Transitions.
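The idea of a posterior distribution over transition matrices can be sketched without a full MCMC run. The example below is a simplification and not the model fitted in this study: it treats an already decoded state sequence as observed and uses the conjugate Dirichlet update for each row, rather than sampling with MCMC. The state sequence is synthetic, the prior is flat, and the function name is hypothetical.

```python
import numpy as np

STATES = ("Awake", "Drowsy", "Asleep")

def posterior_transition_samples(states, prior, n_samples=2000, seed=0):
    """Draw transition matrices from the row-wise Dirichlet posterior.

    Returns an array of shape (n_samples, n_states, n_states), where
    result[s, i, :] is one sampled distribution over next states from i.
    """
    rng = np.random.default_rng(seed)
    n = prior.shape[0]
    counts = np.zeros((n, n))
    for prev, nxt in zip(states[:-1], states[1:]):
        counts[prev, nxt] += 1
    # Conjugacy: posterior concentration = prior pseudo-counts + observed counts.
    rows = [rng.dirichlet(prior[i] + counts[i], size=n_samples) for i in range(n)]
    return np.stack(rows, axis=1)

# Synthetic decoded sequence (assumption): long runs in each state.
decoded = [0] * 50 + [1] * 30 + [2] * 20 + [0] * 10
samples = posterior_transition_samples(decoded, prior=np.ones((3, 3)))

mean_matrix = samples.mean(axis=0)
low, high = np.percentile(samples, [2.5, 97.5], axis=0)
for i, name in enumerate(STATES):
    print(f"{name}: P(stay) = {mean_matrix[i, i]:.2f} "
          f"(95% interval {low[i, i]:.2f} to {high[i, i]:.2f})")
```

The diagonal of `mean_matrix` gives a direct reading of stickiness, while the interval widths show how much a single session can actually support that reading.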

Discussion and Future Applications

The proposed methodology for inferring wakefulness can be used for other neurological disorders in a

similar manner. Change is an inherent property of human behavior and is commonly tightly coupled with time: it

takes time for the change to manifest itself and be observed. In this research we were interested in investigating the

behavioral changes which did not have an instant trigger but instead had the characteristics of precursory observable

signals of behavior alteration. More precisely, we were looking for patterns in the sequence of events on various

time scales, and the previously described temporal reasoning is very suitable as a foundation for inferring a

multitude of symptoms with a probabilistic nature. 19

The approach described so far has a curious property, elasticity, where the solution can be naturally

stretched to adapt to various neurological disorders besides narcolepsy. The elastic characteristic is visualized in

19 We do not mean to imply that all symptoms are probabilistic in nature. What makes them probabilistic/stochastic is our lack of knowledge of the full environment.

Fig.19 in a two-dimensional space, with a logarithmic time scale in the horizontal direction and the complexity of

the observed evidence in the vertical direction.

One can imagine using a unique setting on a hypothetical zooming device for each type of neurological

disorder in order to detect cognitive state alterations. For narcolepsy, the position would be set to Seconds to

Minutes, during which a portion of the evidence, such as speed and accuracy of typing, is analyzed in the inference

tasks about the alertness state of the individual.

Figure 19. Elastic Property of the Proposed System with Hypothetical Adaptation to Other Disorders and

Challenges.

Let us examine different kinds of cognitive state changes which can benefit from similar modeling. A

delayed reaction (observable on the Minutes to Hours scale) is a common warning on painkiller medications, and

by observing the individual's typing pattern it is possible to quantify the effects of such medication on a per-individual

basis. This can be expanded to help individuals self-assess their performance when using over-the-counter drugs,

vitamins, or supplements; for prescribed medications, the observed data can be used as feedback to their

physicians.

How about satisfying the curiosity of individuals with normal brain activity? By keeping historical data

(Months to Years), an individual can track their own performance over time, allowing the possibility to self-diagnose

a potential case of Alzheimer's. Of course the evidence does not have to be limited to the analysis of the typing

pattern — driven by the diagram in Fig.15, the system can employ a collection of different observations ranging

from time of day awareness (closer to nighttime the individual’s energy may be lower) to biometrics gathering (an

individual may be exhausted from relentless physical exercises).

Modeling any kind of neurological disorder would involve acquiring background on the general behavior of

its symptoms. If a representative symptom can be encoded as a signal observable directly or indirectly, then it is

possible to stretch the solution of stochastic modeling to adapt to a specific problem. To initiate research in the

desired areas it would be helpful to organize questions around several different aspects:

• Identify the time scale on which the signal is generally observable: how much time elapses between two subsequent measurements?

• Research the signal's properties and their efficient encoding for processing: is the signal directly observable through biometric devices and, if not, can it be inferred by an auxiliary measurement?

• Estimate the complexity of the system collecting the signal: what is the minimum set of variables required to make an accurate prediction?

• Determine what actions an individual can take if such a prediction is made possible: what is the cost of false negatives and false positives?

• Evaluate the plausibility of discrete or continuous monitoring: is it sufficient to take measurements at discrete intervals of time, or does the behavior in question require continuous monitoring in real time?

Conclusions

In this research we presented a stochastic modeling approach that recognizes the current wakefulness

state of narcoleptic individuals in real time. The wakefulness signal is assumed to be hidden, and inference is made

through observations of the typing pattern. Due to the lack of real data, the temporal behavior of observations is

simulated with a Gaussian Process modeled with a Bayesian approach. The simulated stream of observations

representing the text quality variable is then processed to find the most likely sequence of hidden wakefulness states

through the use of Hidden Markov Modeling. The visualization technique, developed with the intent of simplifying

interpretation for the end user, demonstrated sufficient accuracy in explaining the hidden wakefulness process.

The ultimate intention, however, is that the end user should not have to study the graph looking for insights into the

symptoms’ behavior. This role is assigned to the metacognitive tutor, whose function is to analyze the trends,

variances, direction changes, and overall summaries. The current version of the metacognitive assistant is still in its

early stages and the future development of its capabilities might be just the right technology to educate and assist in

the learning process for individuals with narcolepsy and other neurological disorders. As a civilization we have

made many advances by having a voracious appetite for learning. Narcolepsy does not have to remain an obstacle

to experiencing the excitement, enthusiasm, and hope of continuing to learn and create new things.
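For readers who want to see the decoding step summarized above in concrete form, the following is a minimal sketch of Viterbi decoding for a three-state Hidden Markov Model with Gaussian emissions. It is an illustration only: the emission means and standard deviations, the transition matrix, the start probabilities, and the observation stream are made-up values, not parameters estimated in this study.

```python
import numpy as np

def viterbi_gaussian(obs, start_p, trans_p, means, stds):
    """Most likely hidden state sequence for scalar Gaussian emissions."""
    obs = np.asarray(obs, dtype=float)
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    # log N(obs_t | mean_k, std_k) for every time step t and state k.
    log_emit = (-0.5 * ((obs[:, None] - means[None, :]) / stds[None, :]) ** 2
                - np.log(stds[None, :] * np.sqrt(2 * np.pi)))
    log_trans = np.log(np.asarray(trans_p, dtype=float))
    score = np.log(np.asarray(start_p, dtype=float)) + log_emit[0]
    back = np.zeros((len(obs), len(means)), dtype=int)
    for t in range(1, len(obs)):
        # cand[i, j]: best log-score of a path ending in i, followed by i -> j.
        cand = score[:, None] + log_trans
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backwards from the best final state.
    path = [int(score.argmax())]
    for t in range(len(obs) - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Illustrative parameters (assumptions): 0 = Awake, 1 = Drowsy, 2 = Asleep.
means = [0.9, 0.6, 0.2]            # typical text quality in each state
stds = [0.1, 0.1, 0.1]
start_p = [0.8, 0.15, 0.05]
trans_p = [[0.90, 0.05, 0.05],
           [0.05, 0.90, 0.05],
           [0.05, 0.05, 0.90]]     # sticky transitions
text_quality = [0.92, 0.88, 0.85, 0.62, 0.58, 0.25, 0.18, 0.22]

path = viterbi_gaussian(text_quality, start_p, trans_p, means, stds)
print(path)
```

Working in log space avoids numerical underflow on long observation streams, which matters when text quality is sampled every few seconds over a long session.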

References

Peyron, Christelle, et al. "A mutation in a case of early onset narcolepsy and a generalized absence of hypocretin peptides in human narcoleptic brains." Nature medicine 6.9 (2000): 991-997.

Berro, Laís F., Sergio B. Tufik, and Sergio Tufik. "A journey through narcolepsy diagnosis: From ICSD 1 to ICSD 3." Sleep S[...]

Bassetti [...] "[...]sy." Neurologic clinics 14.3 (1996): 545-571.

Won, Christine, et al. "The impact of gender on timeliness of narcolepsy diagnosis." Journal of clinical sleep medicine: JCSM: official publication of the American Academy of Sleep Medicine 10.1 (2014): 89.

Thannickal, Thomas C., and Jerome M. Siegel. "Hypocretin/Orexin Pathology in Human Narcolepsy with and Without Cataplexy." Orexin and Sleep. Springer International Publishing, 2015. 289-298.

Naumann, A., C. Bellebaum, and I. Daum. "Cognitive deficits in narcolepsy." Journal of sleep research 15.3 (2006): 329-338.

Smith, Karen M., Sharon L. Merritt, and Felissa L. Cohen. "Can we predict cognitive impairments in persons with narcolepsy?." Loss, Grief & Care 5.3-4 (1992): 103-113.

Lopes, Eduardo, et al. "Cataplexy as a side effect of modafinil in a patient without narcolepsy." Sleep Science 7.1 (2014): 47-49.

Flavell, John H. "Metacognition and cognitive monitoring: A new area of cognitive–developmental inquiry." American psychologist 34.10 (1979): 906.

Remke, Anne, and Mariëlle Stoelinga, eds. Stochastic Model Checking: International Autumn School, ROCKS 2012, Vahrn, Italy, October 22-26, 2012. Advanced Lectures. Vol. 8453. Springer, 2014.

Chater, Nick, Joshua B. Tenenbaum, and Alan Yuille. "Probabilistic models of cognition: Conceptual foundations." Trends in cognitive sciences 10.7 (2006): 287-291.

Russell, Stuart, and Peter Norvig. "Artificial intelligence: a modern approach." 1995.

Patil, Anand, David Huard, and Christopher J. Fonnesbeck. "PyMC: Bayesian stochastic modelling in Python." Journal of statistical software 35.4 (2010): 1.

Brahim-Belhouari, Sofiane, and Amine Bermak. "Gaussian process for nonstationary time series prediction." Computational Statistics & Data Analysis 47.4 (2004): 705-712.

Lee, Michael D., and Eric-Jan Wagenmakers. Bayesian cognitive modeling: A practical course. Cambridge University Press, 2014.

Rabiner, Lawrence R. "A tutorial on hidden Markov models and selected applications in speech recognition." Proceedings of the IEEE 77.2 (1989): 257-286.

Ardö, Håkan, Kalle Åström, and Rikard Berthilsson. "Real time viterbi optimization of hidden markov models for multi target tracking." Motion and Video Computing, 2007. WMVC'07. IEEE Workshop on. IEEE, 2007.

Soukoreff, R. William, and I. Scott MacKenzie. "Metrics for text entry research: an evaluation of MSD and KSPC, and a new unified error metric." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2003.

Mueller, Andreas. Hmmlearn. Python library of algorithms for unsupervised learning and inference of Hidden Markov Models. 2014. Print.

Joyner, David A. "Metacognitive Tutoring for Inquiry-Driven Modeling." (2015).

Box, George EP. "Robustness in the strategy of scientific model building." Robustness in statistics 1 (1979): 201-236.

Gelman, Andrew, Xiao-Li Meng, and Hal Stern. "Posterior predictive assessment of model fitness via realized discrepancies." Statistica sinica 6.4 (1996): 733-760.

Gelman, Andrew, and Cosma Rohilla Shalizi. "Philosophy and the practice of Bayesian statistics." British Journal of Mathematical and Statistical Psychology 66.1 (2013): 8-38.

Greenhill, Brian, Michael D. Ward, and Audrey Sacks. "The separation plot: A new visual method for evaluating the fit of binary models." American Journal of Political Science 55.4 (2011): 991-1002.

David, Diana, et al. "JESTER: Joke Entertainment System with Trivia-Emitted Responses." (2002).
