Shegheva 1
Inferring Wakefulness in Narcoleptic Patients during Cognitive Tasks
Snejana Shegheva ([email protected])
The purpose of computing is insight, not the numbers.
Richard Hamming, 1962,
Numerical Methods for Scientists and Engineers
Abstract
The aim of this research is to explore an application of stochastic processes to the narcolepsy neurological disorder in order to optimize the individual’s learning style. A cognitive task such as writing demands attention spans often difficult to achieve for narcoleptics. In this study we investigate the possibility of probabilistic modeling of the wakefulness with the help of Hidden Markov Models.
A proposed approach involves creating a hybrid metacognitive-probabilistic system, ASLEPT¹, which monitors the cognitive state of individuals with narcolepsy in real time through inference tasks by observing their typing patterns. Results of the model are processed by the digital assistant DAWA² to derive insights about the temporal behavior of symptoms unique to the individual.
Methodology developed in the current study is bridging areas of neuroscience, machine learning, probabilistic modeling, visualization and metacognition with the hope of having a high impact on the learning process for individuals with narcolepsy disorders and perhaps many others.
Keywords: narcolepsy, cognitive tasks, probabilistic models, inference, metacognition, visualization
Introduction
Narcolepsy is a neurological sleep disorder characterized by abnormalities of the sleep-wake cycle caused
by hypocretin deficiency in the brain (Peyron, 2000). The main symptoms include excessive daytime sleepiness,
cataplexy, hypnagogic hallucinations, sleep paralysis, automatic behavior and disrupted nighttime sleep (Berro et al,
2014). Although narcolepsy was first described in the 19th century (Bassetti et al, 1996), the exact pathophysiological
mechanism has not been fully uncovered which makes the diagnosis process imprecise and often delayed (Won et
al, 2014). The complexity of this disorder has gained more visibility in recent years in neuroscience communities
which has helped drive the research towards better understanding of the nature of the brain’s abnormalities
(Thannickal et al 2015).
1 ASLEPT: acronym for Awakefulness State Learning Evidenced by Pattern of Typing
2 DAWA: acronym for Digital Assistant for Writing Activities
Multiple investigations have aimed at estimating the cognitive deficit in narcoleptic patients, reporting
impairment in attention span and executive control tasks (Naumann et al, 2006). It has been noted that narcoleptic
sufferers are more likely to report concentration and learning difficulties, especially on non-routine tasks (Smith et
al, 1992). A decreased ability to focus on more complex cognitive tasks has been largely attributed to the necessity
of continuous cognitive resource allocation to vigilance monitoring (Naumann et al, 2006).
These findings suggest that learning is challenging for narcoleptic individuals whenever it
requires maintaining a high level of concentration and focus. With this research we seek to answer the question of whether
technology can provide adaptive mechanisms to improve the learning process for individuals with narcolepsy.
We hope to take the first steps in creating a personalized application with an initial focus on improving the
experience of essay writing: learning about changes in the individual’s typing patterns, educating the individual
about them, and altering or adjusting specific habits if necessary. The goal is to empower people with neurological disorders
to take control of their learning strategies by improving their understanding of their symptoms’ temporal behavior.
Obstacles in Essay Writing for Narcoleptic Patients
“Writer's block is a condition ... in which an author loses the ability to produce new work” [Wikipedia,
retrieved 2015, August 27]. For individuals with narcolepsy this block is significantly harder to overcome, because the
inability “to produce new work” is generally prompted by symptoms of this sleep disorder. It is
nearly impossible for narcoleptic people to anticipate the exact timing of the next wave of
drowsiness and the irresistible urge to sleep. In an overwhelming number of cases, by the time an individual realizes
that he/she has shifted from wakefulness to the first phase of REM, motor activity has been drastically reduced
(Lopes et al, 2014). Such a cataplectic episode commonly causes the phenomenon observed in Fig 1.
The outcome of interrupted writing activity may vary from unfortunate and unrecoverable corruption of the
current document to a loss of ideas inflicted by frequent transitions between being alert and drowsy. While we
cannot hope to recover lost ideas due to their intangible nature, we can minimize the chance of losing the results
of the effort invested so far.
Figure 1. Writing activity in the middle of a narcoleptic episode³
(Image courtesy of Michael Maldonado’s personal experience)
Metacognitive Intervention
The goal of the research is to build an adaptive model trained on the narcoleptic individual’s unique pattern
of typing. This in turn would allow an accurate prediction of the incoming wave of sleepiness and in general
provide an insight on the symptoms’ behavior.
Continuous real-time monitoring and feedback are framed within a metacognitive framework that aims to
help individuals recognize the symptoms beforehand. Metacognition is the art of thinking about thinking and as
such is often described as a means to learn about one's own strengths and weaknesses. Individuals can learn about
different approaches best suited to their own unique learning capabilities (metacognitive knowledge) and
monitor their performance under different circumstances in order to adjust strategies for optimal learning
(metacognitive regulation). A further subdivision of metacognitive knowledge into three variables, person, task
and strategy (Flavel et al, 1979), can be mapped to the cognitive task of writing essays. An individual with
narcolepsy (person variable) is aware of the general difficulties of writing an essay (task variable) due to the disruptive
nature of the symptoms. What can a narcoleptic do, and when (strategy variable), in order to
accomplish the task? When we think of these variables as functioning independently, we lack the ability to
act in a timely fashion: there is no advance warning of the incoming symptom that would allow any action to be taken.
At the intersection of the three variables as visualized in Fig.2 there is an opportunity for an educational
3 The image demonstrates a meaningless sequence of characters most likely caused by muscle weakness in narcoleptic patients during the cognitive task.
technology to help a narcoleptic monitor and regulate their learning process. An application which can accurately
model the wakefulness state with unique characteristics for each individual can provide a timely warning to avoid
corruption of the document and subsequent frustration.
Figure 2. Venn Diagram of the elements of metacognitive knowledge
Metacognition may take multiple forms and in recent years Bayesian theories have attracted more attention
in attempts to describe various cognitive processes (Lee et al, 2014). Bayesian principles of inference have shown
success in modeling human cognition and would be a good foundation for reasoning about the symptom behavior of
neurological disorders such as narcolepsy.
Latent Model for Wakefulness Inference
Motivation for Probabilistic Programming
As mentioned in the sections above, narcolepsy is a complex phenomenon with an almost unique
manifestation of symptoms per individual. Let us take the most common symptom, frequent daily sleep attacks
(excessive daytime sleepiness, EDS), and see if we can accurately describe the process starting with the
assumption of its deterministic nature.
If we knew all the variables which cause EDS, then we should be able to predict exactly the next wave of
drowsiness. However, here lies the problem: there are too many variables, such as the individual's age, the
severity of the disorder, type and amount of medication taken, time of day, general physical state of the
individual, and so on.
This makes the problem intractable very quickly due to a complex interleaving of the intrinsic properties of
each variable. Their causal relationship however can be captured with stochastic models which have rapidly gained
popularity due to their power and systematic approach for modeling the real physical processes which carry some
amount of uncertainty (Remke et al, 2014). Thus, by expressing the transitions of wakefulness states through
probabilistic variables we have a better chance of extracting significant insights from the narcoleptic symptoms
behavior which in turn would allow a better control in the more complex and involved cognitive tasks.
Furthermore, by thinking about mathematical structures and algorithms in conjunction with the cognitive
processes of the mind we can develop sophisticated tools to better understand the complex human cognition as a
physical process (Chater et al, 2006).
Modeling Wakefulness with Directly Observable Variables
In the context of narcolepsy the real physical process which we are interested in is wakefulness,
characterized as a signal which produces observations. We will start with the discrete representation of the signal⁴
through the random variable $Y$,
$Y = \{\text{Asleep}, \text{Drowsy}, \text{Awake}\}$
which takes one of the three values at each point in time $T$.
By observing the dynamics of the variable in the past and present we would like to be able to predict its
value at time $T+1$. This assumption that past states can influence future states creates the foundation for
probabilistic reasoning over time and forms the basis of Markov Chains (Russell and Norvig, 1995).
Graphically the model is illustrated in Fig.3, and in this particular example it is a first order Markov Chain
where the future state is a function of the present state only, a phenomenon also known as the Markov property.
Figure 3. Markov Chain for Wakefulness State Transition Over Time
Using mathematical notation, this property can be expressed as a conditional independence:
$P(Y_t \mid Y_{0:t-1}) = P(Y_t \mid Y_{t-1})$
4 Wakefulness can be modeled as either discrete or continuous and the choice for selecting discrete representation is merely for convenience of modeling
In addition to the simplifying assumption of memorylessness of the process, we will also consider time as a
discrete variable where measurement of the wakefulness state takes place at discrete intervals of time: 5 sec.,
10 sec., 15 sec., etc.
Within each time slice $T$, the random variable $Y$ changes according to transition rules which remain
stationary over time. In the simplistic example demonstrated in Fig. 4, the probability of getting drowsy is
$P(\text{drowsy} \mid \text{awake}) = 0.5$ and the probability of waking up is $P(\text{awake} \mid \text{asleep}) = 0.2$.
Figure 4. Markov Process example of the transition probabilities for Wakefulness signal
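As a minimal sketch of such a first-order chain, the snippet below samples a wakefulness trajectory from a three-state transition matrix. Only P(drowsy | awake) = 0.5 and P(awake | asleep) = 0.2 come from the example above; every other entry of the matrix, as well as the names `TRANSITIONS` and `simulate_chain`, is an assumption chosen for illustration so that each row sums to one.

```python
import numpy as np

STATES = ["asleep", "drowsy", "awake"]

# Rows: current state, columns: next state (asleep, drowsy, awake).
# Only P(drowsy | awake) = 0.5 and P(awake | asleep) = 0.2 are taken from
# the text; the remaining entries are illustrative assumptions.
TRANSITIONS = np.array([
    [0.7, 0.1, 0.2],   # from asleep
    [0.3, 0.4, 0.3],   # from drowsy
    [0.1, 0.5, 0.4],   # from awake
])

def simulate_chain(n_steps, start="awake", seed=0):
    """Sample a state sequence where each state depends only on the previous one."""
    rng = np.random.default_rng(seed)
    state = STATES.index(start)
    path = [state]
    for _ in range(n_steps - 1):
        # The next state is drawn from the row of the current state only.
        state = rng.choice(len(STATES), p=TRANSITIONS[state])
        path.append(state)
    return [STATES[s] for s in path]

path = simulate_chain(12)
print(path)
```

Because the transition rules are stationary, the same matrix is reused at every step; only the current state changes.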
Modeling Wakefulness with Latent Variables
The model described above, which assumes direct observability of the wakefulness signal, has
a strong limitation: it assumes the existence of a stream of annotated data mapping the actual wakefulness to its
encoded representation⁵. In reality, we have to deal with the absence of direct observations and build the model
from the sequential data emitted by the actual process.
A system where a signal source is either unavailable or costly to observe is commonly modeled with a
certain type of stochastic signal representation — Hidden Markov Model (HMM) which can provide a great amount
of insight about the source (Rabiner, 1989).
By treating the writing activities as a byproduct of wakefulness we can build both an inference and a
prediction system: Awakefulness State Learning Evidenced by Pattern of Typing (ASLEPT). To
successfully explain the wakefulness state from the stream of observations we require knowledge of:
• transition rules describing the dynamics (laws) of the wakefulness state
• emission rules describing the influence of the wakefulness state on the quality of the user-produced text
Algorithms such as Viterbi, which finds the most likely sequence of hidden states, and Baum–Welch, which estimates
the model parameters, have been widely applied and are especially relevant in real-time systems (Ardö, 2007). The key components
5 This could be accomplished with high resolution video streams and advanced techniques from Computer Vision
required to be able to generate a timely warning about the next sleep wave are:
• interpreting the current typing pattern
• inferring the wakefulness states by correlating them with the observations
• predicting transitions in the wakefulness state
Let us consider another random variable $X$ which will map to the evidence directly observed from the
typing activity. There is some flexibility in how the observation variable is encoded, and some
literature discusses important techniques for measuring error rates in user-entered text:
keystrokes per character (KSPC) and characters per second (CPS) (Soukoreff et al, 2003). Some adaptation of
these techniques could be applied to the problem at hand, in particular to detecting the quality of entered text in real time
while writing an essay.
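To make the two measures concrete, here is a small sketch of how KSPC and CPS could be computed for one measurement window. The keystroke log format (a list of timestamp and key pairs with a "BACKSPACE" marker) and the function name `typing_metrics` are hypothetical and are not taken from the study.

```python
def typing_metrics(events):
    """Compute KSPC and CPS for one measurement window.

    `events` is a hypothetical keystroke log: a list of (timestamp_sec, key)
    tuples, where key is a single character or the string "BACKSPACE".
    """
    text = []
    for _, key in events:
        if key == "BACKSPACE":
            if text:
                text.pop()          # a correction removes the last character
        else:
            text.append(key)

    n_keystrokes = len(events)
    n_chars = len(text)
    duration = events[-1][0] - events[0][0] if len(events) > 1 else 0.0

    # KSPC: keystrokes spent per character that survives in the final text.
    kspc = n_keystrokes / n_chars if n_chars else float("inf")
    # CPS: surviving characters produced per second of the window.
    cps = n_chars / duration if duration > 0 else 0.0
    return kspc, cps

# "helo" typed, then corrected to "hello": 7 keystrokes, 5 characters, 3 seconds.
log = [(0.0, "h"), (0.5, "e"), (1.0, "l"), (1.5, "o"),
       (2.0, "BACKSPACE"), (2.5, "l"), (3.0, "o")]
kspc, cps = typing_metrics(log)
print(kspc, cps)
```

A KSPC close to 1 indicates few corrections, while a rising KSPC combined with a falling CPS would be one plausible signature of degrading text quality.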
Connecting the previously defined variable $Y$ for wakefulness with the variable $X$ for evidence, the model
from Fig. 4 is reconstructed with an additional layer as shown in Fig. 5. The direct relationship between the evidence $X$
and the actual process $Y$ is modelled by the conditional probability $P(X_t \mid Y_t)$, which expresses the causality whereby
the quality of the typing depends on the current wakefulness state of the narcoleptic individual.
Figure 5. Temporal transitions between hidden narcolepsy states and direct observations.
The model topology, when viewed at time slice $T$ as demonstrated in Fig. 6, reflects the layered structure
of the proposed hierarchical system. The observed layer, the emission model, describes how the quality of the
entered text is affected by the individual’s state, e.g. being drowsy negatively correlates with speed of typing. The
inner hidden layer, the transition model, describes the dynamics of the actual states, e.g. what is the likelihood of
being drowsy now if one has been drowsy for the last N observations.
Figure 6. Hidden Markov Model topology describing the system for observing text quality and inferring the wakefulness state
The above topology of the HMM, also known as a Trellis diagram, allows making certain types of inferences about
the underlying wakefulness process:
• what is the probability of being in one of three states (awake, drowsy or asleep) given the sequential
observations of the entered text?
• how can we detect changes in the wakefulness state given the observations?
• what are the descriptive attributes of the individual’s narcolepsy: how often is the individual in any one
of the states; does any state prevail and at what times of the day?
• how can we visualize narcolepsy transitions in order to educate the individual about their specific and
unique behavior?
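The first of these questions, the probability of each state given the observations received so far, can be answered with the forward (filtering) recursion. The following is a minimal NumPy sketch assuming Gaussian emissions; the initial distribution, transition matrix, means and standard deviations are illustrative assumptions, not values estimated in this study.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def forward_filter(obs, start, trans, means, stds):
    """Return P(state_t | obs_1..t) for every t, and the total log-likelihood."""
    n_states = len(start)
    filtered = np.zeros((len(obs), n_states))
    log_likelihood = 0.0
    belief = start
    for t, x in enumerate(obs):
        if t > 0:
            belief = filtered[t - 1] @ trans            # predict with the transition model
        belief = belief * gaussian_pdf(x, means, stds)  # weight by the emission model
        norm = belief.sum()
        filtered[t] = belief / norm                     # normalise to a distribution
        log_likelihood += np.log(norm)
    return filtered, log_likelihood

# State order: 0 = asleep, 1 = drowsy, 2 = awake (assumed parameters).
start = np.array([0.1, 0.2, 0.7])
trans = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.05, 0.15, 0.8]])
means = np.array([0.05, 0.5, 0.9])
stds = np.array([0.1, 0.1, 0.1])

obs = np.array([0.92, 0.88, 0.55, 0.48, 0.07])
filtered, loglik = forward_filter(obs, start, trans, means, stds)
print(filtered.argmax(axis=1))
```

Each row of `filtered` is a probability distribution over the three states, so a change in its most probable entry from one measurement to the next is one simple way to flag a change in the wakefulness state.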
Cognitive Process Modeling and Preliminary Results
Simulated Gaussian Process for Observations
The activity of writing an essay can be seen as a continuous stream of actions, such as a user entering
characters, or void actions for times when a user suspends typing. By periodically analyzing and summarizing the
outcome of actions we can create a text quality estimator that provides a normalized scalar observable
variable for each interval of time in which a measurement occurred. We take the Bayesian approach to simulate
such observations since it allows us to express our own degree of belief in an event (Patil et al, 2010).
To resemble the real physical process as much as possible it is preferable to add noise to the model of
observations. A Gaussian Process (GP) provides a suitable foundation for simulating temporal patterns for
non-stationary time series of text quality observations. The desired form of the Gaussian process is achievable by
choosing hyperparameters for priors with properties which best reflect our belief about the underlying physical
process (Brahim-Belhouari et al, 2004).
A simple hierarchical model for encoding our beliefs of text quality during the cognitive task of writing in
narcoleptic patients can be represented as follows:
• $\alpha_{state_i}, \beta_{state_i}$: hyperparameters which control the shape of the prior distribution
• $\mu_i \sim \text{Beta}(\alpha_{state_i}, \beta_{state_i})$: prior of the text quality RV⁶ expressed through the Beta distribution
• $\sigma \sim \text{Uniform}(0, 1)$: standard deviation of the text quality RV
• $x_i \sim \text{Gaussian}(\mu_i, \frac{1}{\sigma^2})$: text quality RV modelled with the Gaussian (Normal) distribution
Fig. 7 presents a graphical model for simulating the Gaussian Process for observations. Here, the simulated data $x_i$
are the values which a text quality function takes at each interval of time.
Figure 7. Graphical model for simulating Gaussian Process of observations
Variance of the observations is controlled by the sigma parameter and it models our expectation of the data
volatility. At every simulated measurement, the observation is associated with a normally distributed variable with
the mean being the expected value of the text quality. The prior of the expected value is naturally modeled with
Beta distribution since it has an intuitive interpretation of the parameters alpha and beta as counts of prior successes
and prior failures (Lee et al, 2014).
The principal point in this simulation therefore is finding the appropriate values for the hyperparameters
alpha and beta which will vary depending on the wakefulness state (see Fig. 8). What the graph below shows is that
we expect the text quality to be high for high levels of wakefulness ($\mu \sim 0.9$), average for reduced levels of alertness
($\mu \sim 0.5$) and worst for low levels of wakefulness ($\mu \sim 0.0$).
6 RV here stands for a Random Variable
Figure 8. Prior Beta distributions defined for the three states of wakefulness.
The described model produces the results shown in Fig. 9, which exhibit the desired pattern in the simulated
observations. Of course there is an unlimited number of patterns which can be generated; however, in the interest of
moving forward with the modeling aspect of wakefulness, we will pick the one below since it displays the desirable
properties of sudden declines and rises of the observed variable.
Figure 9. Simulated observations of the Text Quality variable for the cognitive writing task in narcoleptics.
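The simulation described in this section can be approximated in a few lines of NumPy. This is a sketch under stated assumptions rather than the code used to produce Fig. 9: the Beta hyperparameters, the state schedule and the function name `simulate_text_quality` are hypothetical, and the noise scale is fixed at a small assumed value instead of being drawn from Uniform(0, 1).

```python
import numpy as np

# Assumed Beta hyperparameters (alpha, beta) per state; prior means are
# alpha / (alpha + beta): 0.9 (awake), 0.5 (drowsy), 0.05 (asleep, close to 0).
BETA_PRIORS = {"awake": (18.0, 2.0), "drowsy": (10.0, 10.0), "asleep": (1.0, 19.0)}

def simulate_text_quality(state_sequence, sigma=0.05, seed=0):
    """Draw one noisy text-quality observation per measurement interval."""
    rng = np.random.default_rng(seed)
    observations = []
    for state in state_sequence:
        alpha, beta = BETA_PRIORS[state]
        mu = rng.beta(alpha, beta)                  # expected quality for this interval
        x = rng.normal(mu, sigma)                   # noisy measurement around mu
        observations.append(np.clip(x, 0.0, 1.0))   # keep quality within [0, 1]
    return np.array(observations)

# Hypothetical schedule: awake, then a drowsy spell, a sleep attack, recovery.
schedule = ["awake"] * 40 + ["drowsy"] * 20 + ["asleep"] * 10 + ["awake"] * 30
quality = simulate_text_quality(schedule)
print(quality[:40].mean(), quality[40:60].mean(), quality[60:70].mean())
```

Changing the schedule or the hyperparameters produces different patterns of sudden declines and rises in the simulated text quality.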
Real-time Wakefulness Inference with Gaussian Hidden Markov Model
There are many advantages in using a Bayesian approach for making inferences over sequential latent
data. One of the most appealing traits is the ability to estimate the parameters of complex and analytically
intractable models with the help of computational methods like MCMC (Patil et al, 2010).
It is very important however to consider the convergence of the Bayesian Hidden Markov Model especially
for long sequences. Latent models with a large parameter space are notorious for getting stuck in local minima
which can in some cases be addressed by collecting more samples or lifting the time constraint by letting the
sampling algorithms run longer in order to better explore the parameter space.
In the experiment of inferring the wakefulness state over extended periods of time (a highly parametric
model), obtaining accurate results in real time was infeasible within the Bayesian setting. Learning the transition
and emission parameters with high predictive properties required more time than the window within which the feedback to
the user would still be relevant.
To address this limitation, the model was designed in a non-Bayesian setting where no prior distribution
over latent states was specified. Formally the model can be fully described as follows (here $x_t$ denotes the latent state and $y_t$ the observation):
• $N = 3$: number of latent states: Awake, Drowsy and Asleep
• $T = \frac{\Delta t}{5\,\text{sec}}$: number of observations in the time frame $\Delta t$ with a delay of 5 sec. between measurements
• $\varphi_{i=1..N,\,1..N} \sim \text{Dirichlet}(\beta)$: transition matrix between states
• $\mu_{i=1..N}$: mean of observations associated with each state
• $\sigma^2_{i=1..N}$: variance of observations associated with each state
• $x_{t=1..T} \sim \text{Categorical}(\varphi_{x_{t-1}})$: latent state at the measurement time $t$
• $y_{t=1..T} \sim \text{Gaussian}(\mu_{x_t}, \sigma^2_{x_t})$: observation value at the measurement time $t$
The Viterbi algorithm implemented in the hmmlearn library allows estimating the most likely sequence of hidden
states, the Viterbi path, from the observed events in real time (Andreas et al, 2014). Fig. 10 demonstrates results
from inference over two hours of observations which were streamed from the Gaussian Process described in the
previous section.
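The decoding step can be illustrated independently of any library. The sketch below is a plain NumPy Viterbi decoder for a Gaussian-emission HMM, written for illustration only; it is not the hmmlearn implementation used in the study, and the parameters are assumed rather than learned from data.

```python
import numpy as np

def viterbi_gaussian(obs, start, trans, means, stds):
    """Most likely hidden state sequence for a Gaussian-emission HMM (log domain)."""
    obs = np.asarray(obs, dtype=float)
    # log N(obs_t | mean_k, std_k) for every time step t and state k
    log_emit = (-0.5 * ((obs[:, None] - means) / stds) ** 2
                - np.log(stds * np.sqrt(2 * np.pi)))
    log_trans = np.log(trans)

    n_steps, n_states = log_emit.shape
    score = np.zeros((n_steps, n_states))             # best log-probability ending in state k
    back = np.zeros((n_steps, n_states), dtype=int)   # best predecessor of state k
    score[0] = np.log(start) + log_emit[0]
    for t in range(1, n_steps):
        candidates = score[t - 1][:, None] + log_trans   # indexed [previous, current]
        back[t] = candidates.argmax(axis=0)
        score[t] = candidates.max(axis=0) + log_emit[t]

    path = np.zeros(n_steps, dtype=int)
    path[-1] = score[-1].argmax()
    for t in range(n_steps - 2, -1, -1):              # follow the back-pointers
        path[t] = back[t + 1, path[t + 1]]
    return path, score[-1].max()

# State order: 0 = asleep, 1 = drowsy, 2 = awake (assumed parameters).
start = np.array([0.1, 0.2, 0.7])
trans = np.array([[0.8, 0.1, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.05, 0.15, 0.8]])
means = np.array([0.05, 0.5, 0.9])
stds = np.array([0.1, 0.1, 0.1])

path, log_prob = viterbi_gaussian([0.91, 0.87, 0.52, 0.49, 0.08, 0.03],
                                  start, trans, means, stds)
print(path)
```

In hmmlearn the analogous workflow would fit a `GaussianHMM` to the observations and then call its `decode` or `predict` method; in that case the parameters are estimated from the data instead of being fixed by hand as they are here.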
Even a visual analysis of the results suggests a high quality of inference: dense streaks of green,
associated with the awake cognitive state, coincide with high-quality emissions. Similarly, the areas of yellow
and red, associated with the drowsy and asleep cognitive states respectively, accurately follow the reduced level of
text quality. The model exhibits a very short ramp-up time during which latent states have not been fully calibrated and the
confusion matrix may have high values in the off-diagonal sections⁷.
The visualization methodology deliberately combines observations and latent states into a single graph to
be as intuitive and simple as possible without demanding analytical skills from the end user.
7 More on the confusion matrix analysis to be discussed in the evaluation section.
Figure 10. Real-time inference of the wakefulness state⁸ from synthetic text quality observations⁹.
Based on the observation and inferred states, various metrics can be collected for interpreting and
summarizing the overall changes in the cognitive state. The simplest metric, also composed in real time, is shown in
Fig.11 and demonstrates the prevalence of any one of the cognitive states in addition to their association with the
various levels of the text quality.
This is just the start of the types of insight that can be extracted for learning. Additional analysis can
show the impact of further factors on wakefulness, such as time of day or the elapsed time of the performed
cognitive tasks. No correlation with external factors, such as the individual’s overall physical state, is considered
in this analysis; however, this limitation can be addressed by additional sensors which would collect more
observations from the environment.
When an insight is probabilistic in nature, it is necessary to provide some level of confidence in the
result of the inference task. There exist multiple quantitative methods to express the believability of the event and
8 It is difficult to capture the real-time nature of the inference in a static image.
9 The upper curve in blue represents the stream of observations following the modelled Gaussian process, which exhibits various levels of text quality. The lower band communicates changes in the wakefulness state in real time as new observations are received by the model. The three states are encoded with three different colors: green for awake, yellow for drowsy and red for asleep. The graph is highly interactive, which allows users to hover over the data points to see their measured values. Additionally, users can zoom in and out to explore different areas within the graph for either real time or post event analysis.
here, the preference is given to a simple graphical visualization of the Goodness of Fit expressed by the logarithmic
likelihood of the sequence of inferred hidden wakefulness states. The likelihood metric visualized in Fig.11 does
not provide an objective answer about the accuracy of the estimation nor about the minimum number of observations
required for the task. Such subjectivity is however inherent in all stochastic environments and, after some period of
adjustment, the metric is relatively easy for the end user to interpret regardless of their analytics skill level.
Figure 10. Summary of the cognitive states and their association with various levels of text quality¹⁰.
Figure 11. Logarithmic likelihood probability of inferred latent states¹¹.
Model Predictive Power Evaluation
According to statistician George Box, “all models are wrong, but some are useful”; in the quest to assess
model fitness, various methods have been proposed in research and the posterior predictive check is amongst the most
commonly used ones (Gelman et al, 1996). So is the model implemented above good? And if it is good, in what
10 The graphs summarizing wakefulness states receive data from real-time inference and sort it according to latent states. End users can interact with the data and inspect how the text quality threshold changes in each of the defined cognitive states.
11 Logarithmic likelihood demonstrates the confidence of the model in the explained latent states with, again, a very short ramp-up time. As one can observe, the level of likelihood increases as time passes, which is explained by the presence of more observations. At some point however, the model saturates (~1000 measurements) after which no significant increase is recorded.
ways is it so? By sampling from the posterior distribution and comparing the samples to the measured observations we can
estimate the model’s predictive power. Fig.12 demonstrates the capabilities of the current model and, based on
visual inspection, there are no systematic discrepancies between data used as observations and data sampled from the
model. There is a slight weakness around predicting the text quality in the 0.4-0.6 range within the current sample;
however, overall the model appears to fit the data well.
Figure 12. Predictive Posterior Check of the Wakefulness Inference Model
Normally in unsupervised learning, any kind of model metric would be vague since we are not comparing it
to any truth values. However, the simulated Gaussian Process was based on the target prior process which we
specifically designed to follow some predefined shapes. By reverse engineering the values of the target process and
constructing a contingency table (see Fig. 13), we can further visualize the performance of the model.
Figure 13. Confusion Matrix for inferring Wakefulness state¹²
12 The confusion matrix demonstrated satisfactory accuracy values for inferring wakefulness states. Note the high values on the primary diagonal.
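A contingency table of this kind is straightforward to compute once the target states are known. The sketch below uses short hypothetical label sequences, not the data behind Fig. 13, and counts how often each true state was inferred as each latent state.

```python
import numpy as np

def confusion_matrix(true_states, inferred_states, n_states=3):
    """Rows: true state, columns: inferred state, entries: counts."""
    matrix = np.zeros((n_states, n_states), dtype=int)
    for t, p in zip(true_states, inferred_states):
        matrix[t, p] += 1
    return matrix

# Hypothetical labels: 0 = asleep, 1 = drowsy, 2 = awake.
true_states     = [2, 2, 2, 1, 1, 0, 0, 2, 2, 1]
inferred_states = [2, 2, 1, 1, 1, 0, 0, 2, 2, 2]

cm = confusion_matrix(true_states, inferred_states)
accuracy = np.trace(cm) / cm.sum()   # share of measurements on the primary diagonal
print(cm)
print(accuracy)
```

In practice the latent state indices returned by an unsupervised model are arbitrary, so they would first need to be matched to the target states, for example by ordering the states by their mean text quality.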
Andrew Gelman often emphasizes the power of graphical methods versus typical frequentist tests such as the
p-value, where the former are more intuitive even if less objective (Gelman et al, 2013). The last evaluation of
model fitness uses the separation plot methodology described by Greenhill et al in 2011. Our conclusion is that
separation plots would be more suitable for Bayesian modeling and general model comparison. In the current
scenario with Hidden Markov Model inference, the amount of data drowns out the individual
misclassifications, making the separation plot not very informative.
Figure 14. Separation Plots for Inferring Wakefulness States¹³
Wakefulness Metacognosis¹⁴
The important question that we anticipate is: what is the learning goal of the research, and how does the
inference of wakefulness help achieve that goal? Modeling is just one aspect of the problem which in itself is by no
means a complete solution. The necessary step following computations is extracting the insights which can be
visualised in order to diagnose the individual's cognitive state and provide recommendations.
A Digital Assistant for Writing Activities (DAWA) serves the purpose of monitoring the current
wakefulness state and playing an active role as the metacognitive mentor. Current capabilities include real-time
state change detection and an extended on-demand summarization of the performance and overall tendency of
transitioning between states.
A future version of DAWA may initiate a short dialog with the individual by either asking a few questions
further probing the state, or making a joke based on the written content which can potentially lead the individual out
of the hypnotic trance. Strong emphasis is made on adding interactive capabilities to provide an authentic feel of an
13 A separation plot demonstrates the model fitness through the use of a visual yardstick: a perfect model separates high probabilities for actual occurrences of the event.
14 Here, the term metacognosis signifies diagnosing metacognition with the purpose of teaching narcoleptic patients to be cognizant of the temporal behavior of their symptoms.
AI assistant.
Personalization of the digital assistant with an avatar can lead to further gains from the positive effects of
metacognitive tutoring (Joyner, 2015). Fig.15 showcases an example of such a realization by bringing in a folklore
character, a dwarf, with inspiration drawn from the creature’s mining skills¹⁵. Without being too intrusive into
the already highly involved cognitive task of writing, DAWA silently monitors the changes in the inferred cognitive
states and expresses an insight through animated facial features (see images a. for Awake, b. for Drowsy and c. for Asleep).
Figure 15. Personalization of the metacognitive assistant DAWA portraying three different cognitive states:
a) Awake b) Drowsy c) Asleep
Another advantage in using a metacognitive tutor is the ability to adjust to the individual’s performance
knowledge base collected over an extended period of time. Heterogeneity of the symptoms in narcoleptic
individuals can be potentially further analyzed and compared with other individuals with or without neurological
disorders.
Since there are numerous activities available for a metacognitive tutor, some metacognitive tools have
taken the direction of creating a separate avatar for each type of activity. However, for narcoleptic individuals it is
beneficial to remain with a single avatar to preserve attention on the current cognitive task. A metacognitive
tutor can automatically self-assign one of the six currently defined roles to reflect the nature of the assistance.
Table 1 lists the current roles and their frequency of activation.
15 We draw the analogy between gemstone mining and insights mining for narcolepsy
Role | Description | Activation Frequency
Insights | Monitor the results of inference tasks and detect changes in the cognitive state | Constant monitoring
Questions | Post a question to the user when certain undesired behavior persists (e.g. remaining in drowsy state) | Very low frequency to avoid distraction and the need for constant user input
Suggestions | Recommend a different time to perform the cognitive tasks if a cognitive state declines drastically | Reduced frequency (every N measurements)
Actions | Suspend typing by prompting the user with a dialog to continue, to avoid document corruption | Intermittent and activated for inferred tasks when being asleep
Miscellaneous | Retrieve inspirational quotes/jokes from a local database to allow a mental break during a complex cognitive task | If the cognitive task has been performed for a while and degradation has been detected
Report Generation | Similarly to Insights, tries to capture the change; however, in this case it is a much more extended version using the Empirical Bayesian approach | Once at the end of each session
Table 1. List of current roles for the metacognitive tutor.
The metacognitive tutor DAWA has access to inference results and it activates one of the roles selected
based on the observed conditions. DAWA applies various statistical tools such as variance, central tendencies, and
linear regression to detect basic trends and to reduce some potential noise, especially at the very early stages of
inference when the model is not fully calibrated.
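As a minimal sketch of the kind of trend detection described above, the following computes the mean, variance and least-squares slope over a window of inference results. The function name `summarize_trend`, the sample readings and the slope threshold are illustrative assumptions, not part of DAWA itself.

```python
from statistics import mean, pvariance


def summarize_trend(values):
    """Return mean, variance and least-squares slope of a sequence of readings."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    xs = range(n)
    x_bar = mean(xs)
    y_bar = mean(values)
    # Ordinary least-squares slope of the reading against its measurement index.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    return {"mean": y_bar, "variance": pvariance(values), "slope": sxy / sxx}


# Hypothetical P(Awake) readings from successive inference tasks.
p_awake = [0.9, 0.85, 0.8, 0.7, 0.6, 0.5]
summary = summarize_trend(p_awake)

# Hypothetical rule: flag a decline when the slope falls below a threshold.
declining = summary["slope"] < -0.05
```

A negative slope over the window suggests a declining cognitive state, while the variance gives a rough sense of how noisy the early, uncalibrated readings are.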
One of its roles, report generation, deserves a more detailed look as it develops a Bayesian hierarchical
model by learning from the already inferred tasks. This approach, known as Empirical Bayes, is commonly
described as a trick which bridges frequentist and Bayesian inference (Davidson-Pilon, 2015). The essence of the
method is to inform the priors with observed data, which violates the philosophical view of Bayesian inference
but is almost always helpful in the convergence process.
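One way to picture "informing the priors with observed data" is to turn transition frequencies from previously inferred sessions into Dirichlet pseudo-counts. The sketch below is an assumption about how such a prior could be built, not the construction used in this study. The function names, the add-one smoothing and the `strength` parameter are all hypothetical.

```python
STATES = ["Awake", "Drowsy", "Asleep"]


def transition_counts(sequence):
    """Count observed transitions between consecutive states."""
    counts = {s: {t: 0 for t in STATES} for s in STATES}
    for prev, curr in zip(sequence, sequence[1:]):
        counts[prev][curr] += 1
    return counts


def empirical_dirichlet_prior(sequences, strength=3.0):
    """Build per-row Dirichlet parameters from earlier sessions.

    `strength` is the total pseudo-count per row (hypothetical knob):
    a small value keeps the prior weak, i.e. highly uncertain.
    """
    totals = {s: {t: 0 for t in STATES} for s in STATES}
    for seq in sequences:
        counts = transition_counts(seq)
        for s in STATES:
            for t in STATES:
                totals[s][t] += counts[s][t]
    prior = {}
    for s in STATES:
        row_total = sum(totals[s].values())
        prior[s] = {}
        for t in STATES:
            # Add-one smoothing keeps every transition possible under the prior.
            freq = (totals[s][t] + 1) / (row_total + len(STATES))
            prior[s][t] = strength * freq
    return prior


# Hypothetical state sequences inferred in two earlier sessions.
earlier_sessions = [
    ["Awake", "Awake", "Drowsy", "Awake"],
    ["Drowsy", "Asleep", "Asleep", "Awake"],
]
prior = empirical_dirichlet_prior(earlier_sessions)
```

Keeping `strength` small relative to the number of new observations is one way to limit how much the data-informed prior dominates the posterior.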
Earlier, during the description of the real-time inference model, we hinted at the challenge of achieving
timely results if we choose to use a purely Bayesian Hidden Markov Model. So, while not very applicable for
real-time parameter learning, such a model is still very useful for generating reports at the end of cognitive tasks to evaluate
overall performance and various tendencies. Fig.16 demonstrates the graphical model for learning wakefulness
transition parameters. The impact of the violation aspect is minimized by using a large stream of observations and
modeling the uncertainty of the priors with high values.
Figure 16. Graphical Representation of the Wakefulness Model
By fitting this model to the wakefulness state and running MCMC until a satisfactory level of convergence,
we can estimate the transition parameters and report the likelihood of change from any current state.
This generates another source of insights which can also form the history of the performance of the cognitive tasks.
Fig.17 shows heatmaps of the wakefulness state transition matrix before and after fitting. 16 17
Figure 17. Sampled Prior (left) and Posterior (right) Transition Matrices
An interesting insight, based on the current sample only, is that the wakefulness process is very sticky. 18
16 Since it is not trivial to demonstrate a prior of the matrix, we just randomly sampled one for visualization purposes.
17 The Posterior transition matrix is a range of matrices, but for visualization purposes we averaged them across 10000 MCMC chains and a 5000 burn value and a thinning parameter equal to 10, to make sure we observe the good behavior of autocorrelation.
18 Stickiness means a property of remaining in the current state with low likelihood of switching.
However, one must take care not to treat this as a general property of narcolepsy. The beauty of the
Bayesian approach is that, rather than producing a single transition matrix, it yields a posterior distribution
over the transition matrices that shows how likely each transition is to be observed. Fig.18 summarizes transition
values between all defined states: Awake, Drowsy and Asleep. This level of insight, while not very applicable for
real-time cognitive change estimation, serves the purpose of performance comparison across different cognitive tasks
or across individuals with varied stages of narcolepsy on the same tasks.
Figure 18. Posterior of the Wakefulness Transitions.
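To illustrate what a posterior distribution over transition matrices looks like in practice, the sketch below swaps the MCMC fit for a simpler technique: a conjugate Dirichlet update on the transition counts of a fully observed state sequence, sampled with the standard library. This is not the model of Fig.16. The function names, the flat prior and the example sequence are assumptions made for illustration.

```python
import random

STATES = ["Awake", "Drowsy", "Asleep"]


def sample_dirichlet(alphas, rng):
    """Draw one probability vector from a Dirichlet via normalized gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def posterior_transition_samples(states, prior=1.0, n_samples=2000, seed=0):
    """Sample transition matrices from the row-wise Dirichlet posterior."""
    rng = random.Random(seed)
    index = {s: i for i, s in enumerate(STATES)}
    k = len(STATES)
    counts = [[0] * k for _ in range(k)]
    for prev, curr in zip(states, states[1:]):
        counts[index[prev]][index[curr]] += 1
    samples = []
    for _ in range(n_samples):
        # Posterior for each row is Dirichlet(prior + observed counts).
        matrix = [sample_dirichlet([prior + c for c in row], rng) for row in counts]
        samples.append(matrix)
    return samples


def summarize(samples, i, j):
    """Posterior mean and a 5%-95% interval for transition i -> j."""
    values = sorted(m[i][j] for m in samples)
    n = len(values)
    return {
        "mean": sum(values) / n,
        "low": values[int(0.05 * n)],
        "high": values[int(0.95 * n)],
    }


# Hypothetical inferred wakefulness sequence for one session.
inferred = ["Awake"] * 20 + ["Drowsy"] * 10 + ["Asleep"] * 5 + ["Awake"] * 15
samples = posterior_transition_samples(inferred)
awake_stay = summarize(samples, 0, 0)

# Stickiness read as expected dwell time (in measurements) in the Awake state.
expected_dwell = 1.0 / (1.0 - awake_stay["mean"])
```

Reporting an interval rather than a single number for each transition makes it easier to compare sessions or individuals without over-reading one sticky sample.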
Discussion and Future Applications
The proposed methodology for inferring wakefulness can be used for other neurological disorders in a
similar manner. Change is an inherent property of human behavior and is commonly tightly coupled with time — it
takes time for the change to manifest itself and be observed. In this research we were interested in investigating the
behavioral changes which did not have an instant trigger but instead had the characteristics of precursory observable
signals of behavior alteration. More precisely, we were looking for patterns in the sequence of events on various
time scales, and the previously described temporal reasoning is very suitable as a foundation for inferring a
multitude of symptoms with a probabilistic nature. 19
The approach described so far has a curious property — elasticity — where the solution can be naturally
stretched to adapt to various neurological disorders besides narcolepsy. The elastic characteristic is visualized in
19 We do not mean to imply that all symptoms are probabilistic in nature. What makes them probabilistic/stochastic is our lack of knowledge of the full environment.
Fig.19 in a two-dimensional space with a logarithmic time scale in the horizontal direction and the complexity of
observed evidence in the vertical direction.
One can imagine using a unique setting on a hypothetical zooming device for each type of neurological
disorder in order to detect cognitive state alterations. For narcolepsy, the position would be set on Seconds to
Minutes, during which evidence such as the speed and accuracy of typing is analyzed in the inference
tasks to estimate the alertness state of the individual.
Figure 19. Elastic Property of the Proposed System with Hypothetical Adaptation to Other Disorders and
Challenges.
Let us examine different kinds of cognitive state changes which can benefit from similar modeling. A
delayed reaction (observable in the Minutes to Hours scale) is a common warning in painkiller medications, and
by observing the individual's typing pattern, it is possible to quantify the effects of such medication on a per-individual
basis. This can be expanded to help individuals self-assess their performance when using over-the-counter drugs,
vitamins, and supplements; for prescribed medications, the observed data can be used as feedback to their
physicians.
How about satisfying the curiosity of individuals with normal brain activity? By keeping historical data
(Months to Years), an individual can track their own performance over time, allowing the possibility to self-diagnose
a potential case of Alzheimer's. Of course the evidence does not have to be limited to the analysis of the typing
pattern — driven by the diagram in Fig.15, the system can employ a collection of different observations ranging
from time of day awareness (closer to nighttime the individual’s energy may be lower) to biometrics gathering (an
individual may be exhausted from relentless physical exercises).
Modeling any kind of neurological disorder would involve acquiring background on the general behavior of
its symptoms. If a representative symptom can be encoded as a signal observable directly or indirectly, then it is
possible to stretch the solution of stochastic modeling to adapt to a specific problem. To initiate research in the
desired areas it would be helpful to organize questions around several different aspects:
- Identify the time scale where the signal is generally observable: how much time elapses between two subsequent measurements?
- Research signal properties and their efficient encoding for processing: is the signal directly observable through biometric devices and, if not, can it be inferred by auxiliary measurement?
- Estimate the complexity of the system collecting the signal: what is the minimum set of variables required to make an accurate prediction?
- Determine what type of actions can be taken by an individual if such a prediction is made possible: what is the cost of false negatives and false positives?
- Evaluate the plausibility of discrete or continuous monitoring: is it sufficient to take measurements at discrete intervals of time, or does the behavior under question require continuous monitoring in real time?
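The question about false negatives and false positives can be made concrete with a simple expected-cost rule. The sketch below is only an illustration under assumed costs. The function names and the cost values are hypothetical and would have to be chosen per disorder and per individual.

```python
def should_intervene(p_asleep, cost_false_positive, cost_false_negative):
    """Intervene when acting has a lower expected cost than waiting."""
    # Acting is a mistake only if the person is actually awake.
    expected_cost_act = (1.0 - p_asleep) * cost_false_positive
    # Waiting is a mistake only if the person is actually asleep.
    expected_cost_wait = p_asleep * cost_false_negative
    return expected_cost_act < expected_cost_wait


def intervention_threshold(cost_false_positive, cost_false_negative):
    """Probability above which intervening is the cheaper choice."""
    return cost_false_positive / (cost_false_positive + cost_false_negative)


# Hypothetical costs: a needless interruption costs 1, a missed episode costs 4.
threshold = intervention_threshold(1.0, 4.0)
act_now = should_intervene(0.3, 1.0, 4.0)
wait = not should_intervene(0.1, 1.0, 4.0)
```

When a missed episode is much more costly than an interruption, the threshold drops and the system intervenes on weaker evidence. The opposite choice favors fewer distractions.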
Conclusions
In this research we presented a stochastic modeling approach that recognizes the current wakefulness
state in narcoleptic individuals in real time. The wakefulness signal is assumed to be hidden and inference is made
through observations of the typing pattern. Due to the lack of real data, the temporal behavior of observations is
simulated with a Gaussian Process modeled with a Bayesian approach. The simulated stream of observations
representing the text quality variable is then processed to find the most likely sequence of hidden wakefulness states
through the use of Hidden Markov Modeling. The visualization technique, developed to simplify
interpretation for the end user, explained the hidden wakefulness process with sufficient accuracy.
The ultimate intention however is that the end user should not have to observe the graph looking for insights into the
symptoms’ behavior. This role is assigned to the metacognitive tutor with the function to analyze the trends,
variances, direction changes, and overall summaries. The current version of the metacognitive assistant is still in its
early stages and the future development of its capabilities might be just the right technology to educate and assist in
the learning process for individuals with narcolepsy and other neurological disorders. As a civilization we have
made many advances by having a voracious appetite for learning. Narcolepsy does not have to remain an obstacle
to experiencing the excitement, enthusiasm, and hope of continuing to learn and create new things.
References
Peyron, Christelle, et al. "A mutation in a case of early onset narcolepsy and a generalized absence of hypocretin peptides in human narcoleptic brains." Nature medicine 6.9 (2000): 991-997.
Berro, Laís F., Sergio B. Tufik, and Sergio Tufik. "A journey through narcolepsy diagnosis: From ICSD 1 to ICSD 3." Sleep Science (2014).
Bassetti, Claudio, and Michael S. Aldrich. "Narcolepsy." Neurologic clinics 14.3 (1996): 545-571.
Won, Christine, et al. "The impact of gender on timeliness of narcolepsy diagnosis." Journal of clinical sleep medicine: JCSM: official publication of the American Academy of Sleep Medicine 10.1 (2014): 89.
Thannickal, Thomas C., and Jerome M. Siegel. "Hypocretin/Orexin Pathology in Human Narcolepsy with and Without Cataplexy." Orexin and Sleep. Springer International Publishing, 2015. 289-298.
Naumann, A., C. Bellebaum, and I. Daum. "Cognitive deficits in narcolepsy." Journal of sleep research 15.3 (2006): 329-338.
Smith, Karen M., Sharon L. Merritt, and Felissa L. Cohen. "Can we predict cognitive impairments in persons with narcolepsy?." Loss, Grief & Care 5.3-4 (1992): 103-113.
Lopes, Eduardo, et al. "Cataplexy as a side effect of modafinil in a patient without narcolepsy." Sleep Science 7.1 (2014): 47-49.
Flavell, John H. "Metacognition and cognitive monitoring: A new area of cognitive–developmental inquiry." American psychologist 34.10 (1979): 906.
Remke, Anne, and Mariëlle Stoelinga, eds. Stochastic Model Checking: International Autumn School, ROCKS 2012, Vahrn, Italy, October 22-26, 2012. Advanced Lectures. Vol. 8453. Springer, 2014.
Chater, Nick, Joshua B. Tenenbaum, and Alan Yuille. "Probabilistic models of cognition: Conceptual foundations." Trends in cognitive sciences 10.7 (2006): 287-291.
Russell, Stuart, and Peter Norvig. "Artificial intelligence: a modern approach." 1995.
Patil, Anand, David Huard, and Christopher J. Fonnesbeck. "PyMC: Bayesian stochastic modelling in Python." Journal of statistical software 35.4 (2010): 1.
Brahim-Belhouari, Sofiane, and Amine Bermak. "Gaussian process for nonstationary time series prediction." Computational Statistics & Data Analysis 47.4 (2004): 705-712.
Lee, Michael D., and Eric-Jan Wagenmakers. Bayesian cognitive modeling: A practical course. Cambridge University Press, 2014.
Rabiner, Lawrence R. "A tutorial on hidden Markov models and selected applications in speech recognition." Proceedings of the IEEE 77.2 (1989): 257-286.
Ardö, Håkan, Kalle Åström, and Rikard Berthilsson. "Real time viterbi optimization of hidden markov models for multi target tracking." Motion and Video Computing, 2007. WMVC'07. IEEE Workshop on. IEEE, 2007.
Soukoreff, R. William, and I. Scott MacKenzie. "Metrics for text entry research: an evaluation of MSD and KSPC, and a new unified error metric." Proceedings of the SIGCHI conference on Human factors in computing systems. ACM, 2003.
Mueller, Andreas. Hmmlearn. Python library of algorithms for unsupervised learning and inference of Hidden Markov Models. 2014. Print.
Joyner, David A. "Metacognitive Tutoring for Inquiry-Driven Modeling." (2015).
Box, George EP. "Robustness in the strategy of scientific model building." Robustness in statistics 1 (1979): 201-236.
Gelman, Andrew, Xiao-Li Meng, and Hal Stern. "Posterior predictive assessment of model fitness via realized discrepancies." Statistica sinica 6.4 (1996): 733-760.
Gelman, Andrew, and Cosma Rohilla Shalizi. "Philosophy and the practice of Bayesian statistics." British Journal of Mathematical and Statistical Psychology 66.1 (2013): 8-38.
Greenhill, Brian, Michael D. Ward, and Audrey Sacks. "The separation plot: A new visual method for evaluating the fit of binary models." American Journal of Political Science 55.4 (2011): 991-1002.
David, Diana, et al. "JESTER: Joke Entertainment System with Trivia-Emitted Responses." (2002).