donders institute for brain, cognition and behaviour - noise … · 2009-03-06 · keywords: brain...
Noise Tagging as a New Auditory BCI-Paradigm:
a Pilot Study
Author: Joëlle Blankespoor1
Supervisors: dr. J. Farquhar1, prof. dr. ir. P. Desain1, and prof. dr. S. Gielen2
1 Nijmegen Institute for Cognition and Information,
Radboud University, Nijmegen
2 Department of Biophysics,
Radboud University, Nijmegen
October 1st, 2008.
Abstract
Brain Computer Interfaces (BCI) are intended to translate brain activity into computer actions
without intervention of any bodily movements. The main challenge in constructing a BCI
involves reliable signal detection to be used for single trial classification. In this paper, we
focus on developing a BCI using selective auditory attention. Frequency tagging is a way of
watermarking a carrier tone, by means of amplitude modulation, and results in an auditory
steady state response (ASSR). This paper investigates the use of a novel stimulus type for an
auditory BCI paradigm using pseudo-noise codes as amplitude modulators, called noise
tagging. This paper’s main hypothesis is that noise tagging offers more robust signal
detectability than frequency tagging, which is used as the control condition. Two
experimental conditions were used: perceptual and attentional. The experimental results
confirmed that the noise tagged stimuli could be extracted successfully from the EEG signal
on a single trial basis in the perceptual condition, with averaged performances of 79 %. In the
parallel attentional condition the classification results were not consistent across subjects, but
comparable to the results of frequency tagging; highest classification results were around
70 %. Some suggestions for improvement will be discussed.
Keywords: Brain Computer Interface (BCI), noise tagging, frequency tagging, auditory steady
state response (ASSR), selective auditory attention.
1 Introduction
1.1 Controlling the world with thoughts: fantasy or reality?
Imagine that a few years ago you were paralyzed in a car accident. You can no longer walk or
move your arms, but you can still think. This might not seem like much, but it is enough to
perform a wide variety of physical tasks, if your brain activity is detected and interpreted by a
computer connected to a robot. Although it seems futuristic to think that it is possible to
control devices with brain activity, current research suggests that in the future this might
become possible. Even today, a modest form of controlling computers with brain activity is
possible. The so-called Brain Computer Interfaces (BCI) work from the principle that
recorded brain activity can be translated directly into computer actions without interference of
any bodily movements. In very severe cases, a BCI system can be used to restore the
connection to the outside world for a person whose motor disability is caused by a
neuromuscular disorder such as amyotrophic lateral sclerosis (ALS), brainstem stroke or
spinal cord injury (Wolpaw et al., 2002; Lebedev et al., 2006). Other possible applications
of BCI systems include measuring vigilance during car driving (to prevent the driver from
falling asleep), gaming (Lalor et al., 2005), and neurofeedback to treat children with
Attention-Deficit/Hyperactivity Disorder (ADHD) (Strehl et al., 2006).
In the current experiment I will investigate the neuronal response to a noisy auditory
stimulus, as distinguished from eliciting a neuronal response using a steady stimulus. In more
detail, an auditory steady state response (ASSR) in the brain can be elicited when presenting a
subject with an amplitude-modulated tone - that is, when a carrier frequency is modulated by
a lower frequency -, and this ASSR can be recorded using for instance an EEG or MEG
system. This paradigm can be used for a BCI based on selective attention, in which a user
directs his attention selectively to one out of two presented tones, both separated in pitch and
location. Each stimulus has a different frequency tag - a way of watermarking a carrier tone
by means of amplitude modulation - allowing identification of the neural response to the
stimulus in question. From analyzing the EEG-data it is possible to identify to which tone the
attention was directed, because there will be an increase in power of the neuronal signal in the
attended stream (Linden et al., 1987; Tiitinen et al., 1993; Ross et al., 2005; Skosnik et al.,
2007).
For the current experiment, a similar paradigm will be used, but with a different type
of stimulus: frequency tagged stimuli serve as control. Use is made of specially designed
stimulus codes (called Gold codes; Gold, 1967), which contain a broad band of frequencies
and have low correlation between different codes. These pseudo-noise codes are used as
amplitude modulators – this is why this procedure is called noise tagging. The goal of this
project is to determine whether noise tagging has distinct advantages over frequency tagging,
and hence can be used in a BCI.
In the next sections a number of different topics related to the subject of this study will
be explained. In section 1.2 I will describe the current state of the art of BCI development,
and discuss a number of challenges that have to be addressed. Further, I will highlight the
pros and cons of recording brain activity using an EEG system. Section 1.3 deals with the
effects of selective attention on auditory signals in general. In line with this section, the results
of different experiments on the ASSR will be highlighted in section 1.4. This section will also
describe the theory concerning noise tagging in more detail.
1.2 BCI: overview and challenges
A wide variety of BCI methods exists, each with its own pros and cons. There are three
different BCI measurement methods: non-invasive, invasive and an intermediate invasive
technique. These categories will be discussed briefly below, followed by some of the
problems that emerge in using certain recording techniques. The possibilities and limitations
of using EEG in particular, as well as of the different kinds of cognitive tasks that are used to
generate a reliable signal, will then be discussed more thoroughly.
1.2.1 Invasive and non-invasive measurement methods
Invasive methods usually involve single- or multiple cell recordings, in which a number of
electrodes are implanted in the brain. One of the greatest advantages of invasive methods is
the relatively clean signal that is obtained, which is due to the direct contact with the area of
interest; different from EEG, there is no skull between the area of interest and the measuring
device to interfere with the signal. Another advantage of electrodes is their high precision of
implantation in the area of interest. One can for instance record from a population of
neurons, which code for spatial direction (Carmena et al., 2003). Another interesting example
is that of re-establishing the connection via electrodes in the brain to restore mobility, by
stimulating the muscles electrically based on the appropriate kinds of motor-related brain
activity detected by these electrodes (Santucci et al., 2005).
However, invasive techniques have some major downsides, and I will discuss three of
them. As with any kind of surgery, there is a risk of infection when implanting electrodes.
Secondly, a major problem with implantation of electrodes is scar tissue buildup around
the electrodes, which makes them useless after a couple of months, as was found in monkeys
(Weber et al., 2000; Schwartz et al., 2007). Finally, another problem involves the risk of
damage to the implanted region or other areas, which might cause functional impairments.
These are some reasons why invasive methods are only sparingly used in human subjects, but
the results are highly promising for future work.
An intermediate invasive technique that is being researched is ECoG (electrocorticography).
This technique is similar to EEG, except that a pad with electrodes is placed below the
skull; the signal therefore has a higher spatial resolution and a better signal-to-noise ratio than
with EEG. As with all invasive techniques, surgery risks are a concern. However, some of the
disadvantages of implanting electrodes in the brain are less of a problem with ECoG; for
example, with ECoG there is less risk of scar tissue buildup. This technique can be applied in
patients, but is too invasive to be used in healthy test subjects. This is why the vast majority of
BCI experiments uses non-invasive recording techniques.
Non-invasively, brain activity can be detected by means of an fMRI, MEG or EEG
system, but for practical reasons the latter is the best technique for BCI patients. An EEG
device is transportable, it does not need a shielded room, or magnets to be cooled, it has a
high temporal resolution and it is not very expensive (Lotte et al., 2007). In this study I will
focus on the use of EEG for a BCI system, and its possibilities and limitations. As noted
above, the most important property of EEG is its high resolution in the time domain. In the
spatial domain EEG is less specific, because of volume conduction, which is the process by
which an electric signal spreads out when it passes through different tissues. That is, EEG
localization techniques are highly complicated because of what is called the inverse problem;
a certain recorded scalp topography can be generated by an infinite number of different source
locations and combinations. Thus different neuronal sources might have the same recorded
scalp distribution of activity.
1.2.2 Mental tasks
Different BCI systems have been developed, varying from systems based on different sensory
modalities combined with selective attention, imagining movements, or systems using more
automatic processes. One of the most widely studied cognitive tasks is motor imagery,
because the associated neuronal signals are relatively easy to detect (Babiloni et al., 2000;
Penny et al., 2000; Pfurtscheller et al., 1993).
A method that differs from using a specific cognitive task is that of the so-called
‘operant conditioning’ approach, in which the mental control skill is acquired subconsciously
through feedback, but the subject learns to deploy the skill voluntarily (Birbaumer et al.,
1999; Wolpaw, 1998). The subject may think about anything during the task as long as the
result is control of the cursor on the screen. In a way, this method is comparable to acquiring a
new skill, such as learning to ride a bicycle, where you learn to control the bicycle despite not
being aware of the exact nature of the neural activity or equilibrium dynamics involved in the
process.
Another possible BCI signal is the P300, which is a positive deflection (200-700 ms
after stimulus onset) in response to a rare stimulus surrounded by other random stimulus
events. The P300 can be successfully used to select items through a matrix speller to
communicate spontaneously (Nijboer et al., 2008a). Finally, self-regulation of slow
cortical potentials (SCP) can also be used as a basis for a BCI: given feedback, the subject
can learn to voluntarily control the SCP (which can last several seconds) (Pham et al.,
2005).
When developing a BCI system a number of methodological issues need attention,
which are: 1) how to extract the right cognitive processes; 2) the amount of training time; 3)
the amount of data transfer; 4) how to deal with subject variability.
Firstly, the brain is never quiet; the challenge is to extract the features of interest while
ignoring other processes that are taking place. Different experiments have shown the
difficulty to exactly pinpoint the cognitive process taking place. Moreover, there might be
intermediate cognitive processes that cause changes in the recorded EEG signal, like
concentration, attention and difficulty of the task (Curran et al., 2003).
The second concern in developing a BCI system is the amount of training time
necessary to let the subject acquire control over their EEG output: some research groups
reported that fairly extensive training (10-40 sessions) was required to improve accuracy of
EEG control (Curran et al., 2003).
Thirdly, the amount of signal (trial length) needed to reliably classify the right class is
another point of importance. In constructing a BCI we need to maximize both the number of
decisions per minute, and the signal reliability (successful classifications).
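The trade-off between trial length and reliability can be made concrete with an information transfer rate. The sketch below assumes the bit-rate formula that is commonly used in the BCI literature for N equiprobable classes classified with accuracy P, B = log2(N) + P log2(P) + (1 - P) log2((1 - P) / (N - 1)); the text above does not name a specific measure, so this choice, the function names, and the clamping to zero at or below chance are all illustrative assumptions.

```python
import math

def bits_per_decision(accuracy, n_classes=2):
    """Information per decision (bits), assuming equiprobable classes
    and errors spread evenly over the wrong classes."""
    p, n = accuracy, n_classes
    if p <= 1.0 / n:
        return 0.0              # convention chosen here: no information at or below chance
    if p >= 1.0:
        return math.log2(n)     # perfect classification
    return math.log2(n) + p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))

def bits_per_minute(accuracy, trial_seconds, n_classes=2):
    """Scale the per-decision information by the number of decisions per minute."""
    return bits_per_decision(accuracy, n_classes) * 60.0 / trial_seconds

# Example: a two-class decision at 79 % accuracy, one decision per 2.25 s
# (a 2 s stimulus plus a 250 ms silence; an assumed timing for illustration).
print(bits_per_decision(0.79), bits_per_minute(0.79, 2.25))
```

Shorter trials raise the number of decisions per minute but usually lower the accuracy, so the product of the two terms, rather than either one alone, is what has to be maximized.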
The last aspect is the effectiveness of different cognitive tasks as reliable sources of
signals in different subjects. A variability of fitness for specific tasks between individuals
makes it almost impossible to come up with a one-task-fits-all paradigm. This is even more
problematic for people with specific disabilities. Tasks involving motor imagery might be
difficult to execute for a person who has been paralyzed for several years (Pfurtscheller et al.,
2000), and a visual task will be nearly impossible for someone who is visually impaired.
These limitations indicate the need to build a number of different BCI systems with a broad
range of reliable tasks (Curran et al., 2003).
One set of these tasks involves attending selectively to sounds. The auditory modality,
which we focus on in this paper, is well-suited to be used in a BCI system, for instance
because it is a phenomenologically prominent modality and it is one of the last remaining
modalities for ALS-patients. In the next section I will start the exploration of this topic with a
general description of auditory neuronal signals, as well as the influence of selective attention.
1.3 Attending to sounds
At any given moment a person might hear different sounds from different locations. In such a
natural environment, multiple sources confront the auditory system with different types of
information. Among the modalities present, it is important to selectively attend to one of the
sources in order to evaluate and process relevant stimuli. The process of separating a relevant
stream in the auditory modality from its surroundings is called auditory scene analysis (ASA;
Bregman, 1990). An often-used example of the separating process is the so-called cocktail-
party effect, in which a person in a crowded room wants to attend to the conversation in
which he is involved while ignoring other conversations taking place. Selective attention
serves as a vital mechanism to allow a subject to process relevant information. Yet, attention
is a process that is difficult to quantify in a straightforward manner because it never involves a
simple on or off structure; rather, it is a process involving many simultaneous intermediate
values. This notion is important when using a task design involving attention as in the current
experiment.
Studies focusing on the mechanisms of selective attention in the brain, using
functional imaging techniques, found activity in primary and secondary auditory cortices
(Alho et al., 1999; Jancke et al., 1999; Crady et al., 2000; Sevostrionov et al., 2002). Another
known effect of attentional modulation is a correlation between increasing neuronal
magnitude and increasing task difficulty (Spritzer et al., 1988; Boudreau et al., 2006),
although this correlation can be influenced by task design and behavioral strategies.
Although localization studies are useful in terms of knowing where in the brain
auditory processing takes place, functional imaging techniques are not precise enough to
uncover the millisecond time course in which auditory processing takes place. A large number
of studies focusing on auditory attentional processes used MEG/EEG to record information on
a fine-grained timescale. An often-used signal is the transient event-related potential (ERP),
elicited with brief stimuli. Often-studied effects are the brain responses to change, such as the
N1 (~100 ms) and the later (~100-200 ms) mismatch negativity (MMN). The MMN is an
example of an ERP that is elicited when selective attention is directed to a target stimulus
surrounded by distractors, a design also referred to as the ‘oddball paradigm’. The MMN has
received a lot of attention because of its possible role as a ‘novelty detector’, which serves
to encode stimulus deviations that can capture attention and result in a behavioral
response (Fritz et al., 2007). Studies of these ERP components have used relatively short
stimuli (for instance tone pips) with long interstimulus intervals (Skosnik et al., 2007).
As will be discussed in the next section, an alternative technique to examine
attentional processes is the generation of the auditory steady state response, for which one can
use continuous stimulus modulation. The latter aspect is important for BCI applications,
because the stimulus can be presented for multiple modulation periods, increasing the signal
strength.
1.4 Towards Noise Tagging
In the previous sections I discussed a few auditory signals, selective attention, different
cognitive tasks for BCI systems and the problems that emerge when developing a BCI system.
In this last introductory section I will discuss several aspects of frequency tagging and noise
tagging more thoroughly.
Frequency tagging utilizes the auditory steady state response (ASSR): the neuronal
response to a periodic auditory stimulation such as click trains, amplitude or frequency
modulations (AM or FM) of continuous sounds (Galambos et al., 1981; Hari et al., 1989; John
et al., 2003). The ASSR provides interesting means for investigating topics related to the
assessment of hearing thresholds (Galambos et al., 1981; Stapells et al., 1984), developmental
research (Maurizi et al., 1990; Boettcher et al., 2002; Rojas et al., 2006), levels of
consciousness (Pockett et al., 2002; Plourde et al., 2008), and selective attention (Linden et al.,
1987; Tiitinen et al., 1993; Ross et al., 2005; Skosnik et al., 2007). Possible generators of the
ASSR are the primary auditory cortex in Heschl’s gyrus and the thalamus (Ross et al., 2004;
Skosnik et al., 2007).
The steady state response is not bound exclusively to the auditory domain: the so-
called steady-state visual evoked potentials (SSVEP) can be recorded when presenting a
constantly oscillating visual stimulus. The SSVEP is also modulated by (c)overt selective attention
(covert means that the stimulus is not at the foveal center), and hence might be useful for a BCI
(Müller et al., 1998; Allison et al., 2008). Another useful signal for a BCI is the steady state
somatosensory evoked potential (SSSEP), elicited using vibratory stimulation on for instance
a finger tip (Müller-Putz et al., 2006).
The precise mechanisms that give rise to the neuronal steady state response are a topic
of debate. Synchronisation (resonance) is a candidate for the explanation of the generation of
the SSR (Tanaka et al., 2008), especially in the gamma-band (around 40 Hz). This suggests
that neuronal units have an intrinsic firing rate that best resonates with stimulus frequencies
around 40 Hz. Moreover, experimental findings suggest that the response to a 40 Hz
modulated stimulus is two to three times larger in power compared to higher frequencies, of
70 Hz and above (Herdman et al., 2002; Petitot et al., 2005).
The term “noise” can have several different meanings. In general, noise refers to a
broadband and unpredictable signal. I will discuss three different ways in which noise can
play a role in neuronal systems: 1) noise as interfering with signal encoding; 2) noise as a
neuronal signal enhancer; 3) noise as a stimulus.
Firstly, recall that the purpose of this experiment is to test the effectiveness of noise
tagging instead of frequency tagging for a BCI, and this might be surprising to some. After
all, much of the activity in the brain is spontaneous, and might have nothing to do with actual
processing of environmental stimuli or motor actions that we are interested in for a particular
BCI task, hence this activity is often referred to as noise. In most studies, noise is considered
to disrupt and interfere with the encoding of relevant stimuli. Discussions in the literature
focus mostly on how reliable neuronal processing can compensate for neuronal noise.
However, evidence exists that reliable neural coding can make use of noise, instead of
being forced to compensate for it – this is the second noise type. Though it seems
counterintuitive, the addition of noise can actually lead to a larger response to a stimulus
(Shechter et al., 2006). When recording intracellularly, a current injection into a neuron will
cause a spike train. If a spike train is evoked by injecting a steady-state current, the first
few action potentials are similar across multiple trials, but the later action potentials
have greater variability across trials (Bryant et al., 1976; Mainen et al., 1995). However,
when the same method is used with ‘frozen noise’ input (frozen referring to reusing the
same noise sample, drawn from a random distribution, on every trial), spiking becomes
reliable (Bryant et al., 1976; Mainen et al., 1995; Galán et al., 2008; Ermentrout et al., 2008).
Not only does noise enhance neuronal firing, as mentioned above, it also increases the
power of the phase synchrony of the ASSR as Tanaka and colleagues (2008) found. They
recorded subjects using a MEG system when AM tones were presented with white noise of
various intensities. A possible explanation of this phenomenon is stochastic resonance, which
refers to a nonlinear system showing a decrease in noise to signal ratio (N/S) when there is an
increase of the noise level in the input.
Thirdly, taking the idea of noise enhancing neuronal firing one step further, one might
argue that neuronal processing of a noise stimulus itself is also possible. The basic idea
behind noise tagging is that of spreading the signal’s power over a broad band of frequencies,
and this has several advantages. For instance, signal detectability will decrease little when
another source masks a particular frequency band. This advantage is important because signal
detection robustness is an important reliability aspect in BCI systems. By spreading the power
over multiple frequency bands the chances are maximized that some signal always remains to
be detected. Another aspect of this approach is that inter-subject variations are less
problematic: if the individual optimal frequency ranges vary, it is likely that each of these
ranges falls within the stimulus frequency range.
Frequency tagging stimuli have short repetition periods, which increases the risk that short
neuronal lags become difficult to distinguish from longer latencies, a problem referred to as
aliasing. With noise tagging, temporal aliasing is unlikely to occur, because the stimulus
period (in our experiment ~2 sec) is much longer than the expected neuronal lags. In fact, the
noise modulator can be designed to last much longer.
Another important notion of these noise codes is that there is little crosscorrelation
between different codes. This property will make it easier for correlation analysis methods to
distinguish which stimulus code has been presented or attended. Moreover, autocorrelation of
one code, that is correlating the code with a timeshifted version of itself, is low except at time
lag zero. This latter aspect is for instance important when the brain’s neuronal response has
several different time lags.
The concept of using noise stimuli is inspired by the “spread spectrum” techniques.
Faced with the problem that wireless signals needed to be secured from “being disturbed,
intercepted, or interfered with in any way”, Nikola Tesla came up with the “frequency-hopping
spread spectrum” idea in 1900. This works by rapidly switching a carrier among many
frequency channels. Another variant is the “direct-sequence spread spectrum” approach, in
which a normal narrow-band signal is spread over a broad band of frequencies by multiplying it
with a “noise” signal. This noise signal is generated using pre-designed Gold codes. Each code
is built from two pseudorandom number sequences: sets of predefined numbers that
nevertheless approximate the properties of random sequences (Gold, 1967; Hershey, 1982;
Dixon, 1994). Today, spread spectrum techniques are used in various wireless communication
systems. If the brain is conceived of as a signal processing device, these tried and true
techniques might also be useful for retrieving stimuli from neuronal signals.
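To illustrate how such codes can be built and why their correlation properties matter, the following Python sketch generates two short Gold codes from a pair of shift-register sequences and inspects their periodic correlations. It is a minimal sketch under stated assumptions: the degree-5 polynomials, the shifts 3 and 11, and all function names are chosen here for illustration, and they are not the generators of the codes A and B used in this study, which are not specified in the text.

```python
import numpy as np

def m_sequence(exponents, degree):
    """One period of a maximal-length shift-register sequence.

    Recurrence: s[k + degree] = XOR of s[k + e] for e in exponents,
    where exponents are the lower-order terms of the feedback polynomial.
    """
    length = 2 ** degree - 1
    s = [1] + [0] * (degree - 1)        # any non-zero start state works
    while len(s) < length:
        k = len(s) - degree
        bit = 0
        for e in exponents:
            bit ^= s[k + e]
        s.append(bit)
    return np.array(s, dtype=int)

def periodic_correlation(a, b):
    """Circular correlation of two 0/1 codes after mapping them to +1/-1."""
    x, y = 1 - 2 * a, 1 - 2 * b
    return np.array([int(np.dot(x, np.roll(y, lag))) for lag in range(len(x))])

# Illustrative degree-5 pair: x^5 + x^2 + 1 and x^5 + x^4 + x^3 + x^2 + 1 (31-bit codes).
u = m_sequence((2, 0), 5)
v = m_sequence((4, 3, 2, 0), 5)

# XOR-ing u with differently shifted copies of v gives different members of the family.
code_a = u ^ np.roll(v, 3)
code_b = u ^ np.roll(v, 11)

auto = periodic_correlation(code_a, code_a)
cross = periodic_correlation(code_a, code_b)
print("autocorrelation at lag zero:", auto[0])
print("largest off-peak autocorrelation:", np.abs(auto[1:]).max())
print("largest crosscorrelation:", np.abs(cross).max())
```

The autocorrelation at lag zero equals the code length, while all other auto- and crosscorrelation values stay small in comparison; this gap is what a correlation-based decoder relies on.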
1.5 Hypothesis
Extracting the noise codes back from the neuronal signal is the main challenge and goal of
the current experiment. To achieve this goal a new stimulus type is designed in
order to develop a better BCI system based on selective auditory attention. The
hypothesis is that noise tagging offers significant advantages over frequency tagging as a way
to increase neuronal signal detectability in BCI applications.
2 Methods and procedure
2.1 Subjects
Five subjects participated in the current experiment. All
subjects were healthy adults (2 women, 3 men; aged between 22 and 35). All subjects were
free of neurological disorders, had normal or corrected-to-normal vision and did not report
any hearing disabilities. Only three subjects had some prior experience with EEG recordings.
Subjects were informed about the global content of the study in advance.
2.2.1 Data acquisition
EEG recordings were acquired in an electromagnetically shielded and sound-attenuated cabin,
using 256 sintered Ag/AgCl active electrodes, amplified with a BioSemi ActiveTwo AD-box.
During electrode placement offset jitter and offset amplitude values were kept below 0.2 mV
and 35 mV respectively. ActiView was used for data acquisition with a sampling rate of 2048
Hz. Data were stored for offline analysis and no further filtering was used at this stage.
2.2.2 Experimental setup
Subjects were comfortably seated in a chair while watching a computer screen. Sounds were
presented on two loudspeakers in the right and left corner of the recording room. The sounds
were presented at a comfortable listening level for each subject (estimated at 70 dB SPL
(sound pressure level)). The grounding electrodes were placed on the forehead in order to
prevent the formation of gel bridges (gel contact between two electrodes causing a short-circuit),
which could have occurred had a more central place on the head been chosen. Under the
assumption that eye movements and blinking artefacts would not threaten signal integrity at the
frequencies of interest, no electro-oculography (EOG) signals were recorded (see Fatourechi et al. (2007)
for a more extended overview about recording EOG in a BCI).
2.2.3 Auditory stimuli
The stimuli were generated by Matlab programs (version 7.5), with a sampling frequency of
44.1 kHz. The frequencies of both the carrier and the modulation signal were chosen such that
they contained an integer number of cycles in each modulation period. Moreover, the
modulators were matched to fit with the down-sampled sampling rate of the EEG recording
system.
For the stimuli two carrier signals at 512 and 768 Hz were used, each approximating a
sawtooth function with six harmonics. These carrier signals were amplitude-modulated with
a depth of 90 % with four different modulation patterns: two for frequency tagging, at 42 2/3
Hz (referred to as 42 Hz in this paper) and 64 Hz, and two for noise tagging at 128 Hz, called
code A and B (see figure 1 for the modulators and their spectra). All stimuli were ~2 sec in
duration and shaped by up-sampling the chosen bit-patterns and cosine filtering the rising and
falling edges for half of one cycle. The filter depended on the stimulus frequency. Effectively
this means that both frequency tags are similar to sine waves, and for the noise tags a
transition of zeros and ones looks like a sine wave.
Figure 1: Stimulus time-courses and spectra. On the left the time-course of each stimulus modulator is shown and on the right the spectrum of each modulator is plotted (note that only the first 200 ms of each stimulus is plotted).
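The stimulus construction described above can be sketched in Python as follows. This is not the original Matlab code; it is a minimal sketch that takes the sampling rate, carrier frequencies, modulation depth and bit rate from the text and fills the gaps with assumptions: the harmonics are weighted by 1/k, the modulation depth is applied as an envelope between 1 - depth and 1, and the cosine edge shaping is interpreted as a half-cosine ramp that lasts one bit period. The bit pattern in the example is a random placeholder, not code A or B.

```python
import numpy as np

FS = 44100     # audio sampling rate in Hz (from the text)
DEPTH = 0.9    # modulation depth (from the text)

def sawtooth_like_carrier(f0, t, n_harmonics=6):
    """Sum of the first six harmonics with 1/k weights (assumed weighting)."""
    wave = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, n_harmonics + 1))
    return wave / np.abs(wave).max()

def sine_modulator(f_mod, t):
    """Frequency tag: raised cosine between 0 and 1, starting at 0."""
    return 0.5 * (1 - np.cos(2 * np.pi * f_mod * t))

def code_modulator(bits, bit_rate, t):
    """Noise tag: move from the previous bit to the current one along a half-cosine."""
    bits = np.asarray(bits, dtype=float)
    pos = t * bit_rate
    idx = np.minimum(pos.astype(int), len(bits) - 1)
    phase = np.clip(pos - idx, 0.0, 1.0)
    prev = np.concatenate(([0.0], bits[:-1]))    # assume silence before the first bit
    ramp = 0.5 * (1 - np.cos(np.pi * phase))
    return prev[idx] + (bits[idx] - prev[idx]) * ramp

def amplitude_modulate(carrier, modulator, depth=DEPTH):
    """Envelope swings between 1 - depth (modulator 0) and 1 (modulator 1)."""
    return carrier * (1 - depth * (1 - modulator))

t = np.arange(2 * FS) / FS                       # 2 s of audio samples
freq_tag = amplitude_modulate(sawtooth_like_carrier(768, t), sine_modulator(64, t))

placeholder_bits = np.random.default_rng(0).integers(0, 2, 256)   # 256 bits at 128 Hz = 2 s
noise_tag = amplitude_modulate(sawtooth_like_carrier(512, t),
                               code_modulator(placeholder_bits, 128, t))
```

Under this interpretation an alternating bit pattern at 128 Hz produces the same modulator as a 64 Hz sine tag, which matches the remark that a transition of zeros and ones looks like a sine wave.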
2.3 Experimental design
EEG recordings lasted 1.5 hours. The experiment was divided into three different blocks: a
perceptual block and a parallel attentional block. The blocks were divided in different
sequences (~1 min), containing per sequence either noise or frequency tagging stimuli. Each
sequence was constructed out of multiple (20) trials/epochs lasting 2 seconds with small
silences of 250 milliseconds between the trials (see figure 2).
During each sequence the subject was instructed to look at a fixation cross on the
screen, to minimize the number of eye movements. Between each sequence the subject was
able to have a short break. The number of trials that were collected in the different
experimental conditions were: 150 trials in the perceptual and 100 trials in the attentional
frequency tagging data, 160 trials for both the perceptual and attentional noise tagging data.
Note that in subject 1, due to experimental error, the perceptual condition has 110 trials and
behavioral data are not available.
Figure 2: Example of a sequence in the attentional condition. The subject is instructed to attend to the right sounds and count the number of low volume deviants. The deviants on the other side have to be ignored.
In order to minimize the number of different experimental conditions the carriers
and their modulators were fixed, meaning that the 512 Hz carrier (modulated with code A or
the 42 Hz frequency tag) was fixed on the left side and the 768 Hz carrier with the other
modulators on the right side.
In the perceptual recording block a single auditory stream was presented on both
speakers. A simple counting task was used to make sure that the subject was paying attention;
in more detail the subject was instructed before every sequence to count one stimulus type.
In the parallel attentional block two different auditory streams were presented. The
subject was instructed to count the number of deviants (a lower amplitude stimulus) on the
attended side (right or left speaker); deviants were presented on both sides (probability of
occurrence 0.10). In both the perceptual and attentional block, subjects got feedback on their
performance after each sequence. The feedback was stored for later analysis of the behavioral
performances.
2.4 Data analysis
The main data analysis steps are shown in figure 3 and included: a number of pre-processing
steps (see section 2.4.1), correlation analysis (see section 2.4.2) and classification (see section
2.4.3). Further, the classification outputs were decomposed as is described in section 2.4.4.
Figure 3: Data analysis protocol. In this figure the main steps of the data analysis are shown. Starting with the raw data a number of pre-processing steps were used. The square boxes display the main procedure and optional steps are plotted in the oval boxes.
2.4.1 Pre-processing
The first step in the data analysis was to downsample the recorded data from 2048
to 512 Hz, because no brain signals were to be expected above 256 Hz (the
Nyquist frequency after downsampling). After that, the relevant data segments were epoched
with a length of 2.5 sec (2 sec stimulus plus an extra 250 ms pre- and post-stimulus for
analysis). Next, the bad channels (high-amplitude 50 Hz and/or offset jitter) were excluded
from further data analysis. Common average referencing (CAR) was used for re-referencing;
that is, the average over the complete scalp was subtracted from every single electrode at
every point in time.
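The epoching, channel rejection and re-referencing steps can be sketched as follows. This Python/NumPy sketch is an illustration under assumptions: the array layout (channels by samples), the function names and the way bad channels are listed are hypothetical, the data are assumed to be downsampled to 512 Hz already, and onsets are given as sample indices.

```python
import numpy as np

FS_EEG = 512   # Hz, sampling rate after downsampling (from the text)

def cut_epochs(eeg, onsets, fs=FS_EEG, pre=0.25, stim=2.0, post=0.25):
    """Cut (n_trials, n_channels, n_samples) epochs around stimulus onsets.

    eeg is (n_channels, n_samples); onsets are sample indices of stimulus start.
    """
    n_pre = int(round(pre * fs))
    n_len = int(round((pre + stim + post) * fs))
    return np.stack([eeg[:, o - n_pre:o - n_pre + n_len] for o in onsets])

def drop_channels(epochs, bad_channels):
    """Remove channels that were marked as bad (indices supplied by the experimenter)."""
    keep = np.ones(epochs.shape[1], dtype=bool)
    keep[list(bad_channels)] = False
    return epochs[:, keep, :]

def common_average_reference(epochs):
    """Subtract the mean over channels from every channel at every time point."""
    return epochs - epochs.mean(axis=1, keepdims=True)

# Synthetic example: 8 channels, 10 s of data, two stimulus onsets, channel 3 marked bad.
rng = np.random.default_rng(1)
eeg = rng.standard_normal((8, 10 * FS_EEG))
epochs = common_average_reference(drop_channels(cut_epochs(eeg, [512, 2560]), [3]))
print(epochs.shape)
```

Bad channels are removed before the common average is computed so that their artefacts do not leak into the reference that is subtracted from the remaining channels.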
The last step in the pre-processing was applying a bandpass filter and a 50 Hz notch filter to
the data. The passband of the bandpass filter was tuned to give the best response
for the data gathered in the experiment.
2.4.2 Data analysis procedures
After the pre-processing steps a number of different methods were tested to get the
best performance. In short, these methods included: normalizing the spectrum; whitening
the signal; crosscorrelating the signal with the original stimulus modulators. The effects of
each step were examined by comparing the output of the classifier, which will be explained in
more detail in section 2.4.3.
With a broadband stimulus, power is distributed approximately equally over all frequencies. However, it
was found that the “1 / f” power distribution of the EEG signal led to dominance of the lower
frequencies. Therefore, the averaged spectrum was calculated for every subject, and used to
normalize away the 1 / f effect in the EEG spectrum.
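One way such a normalization could be implemented is sketched below. This is an assumption for illustration, not necessarily the procedure used here: each trial's Fourier spectrum is divided by the amplitude spectrum averaged over that subject's trials, and the result is transformed back to the time domain. The function name and the random-walk test signal are hypothetical.

```python
import numpy as np

def normalize_spectrum(trials, eps=1e-12):
    """Divide each trial's spectrum by the average amplitude spectrum.

    trials: array of shape (n_trials, n_samples) for one channel of one subject.
    """
    spectra = np.fft.rfft(trials, axis=1)
    mean_amplitude = np.abs(spectra).mean(axis=0)   # averaged spectrum over trials
    flattened = spectra / (mean_amplitude + eps)    # remove the average spectral shape
    return np.fft.irfft(flattened, n=trials.shape[1], axis=1)

# Hypothetical data: random-walk noise, in which low frequencies dominate.
rng = np.random.default_rng(0)
trials = np.cumsum(rng.standard_normal((40, 1024)), axis=1)
flat = normalize_spectrum(trials)
mean_after = np.abs(np.fft.rfft(flat, axis=1)).mean(axis=0)
```

After normalization the average amplitude is the same in every frequency bin, so the low frequencies no longer dominate the features passed to the classifier.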
Further, the use of a whitening function on the data was investigated, because such a
function gives all features in the data equal power, which could make it easier for the
classifier to pick the relevant features.
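The text does not specify which whitening function was used. As an illustration only, the sketch below applies a symmetric (ZCA-style) whitening transform, which decorrelates the features and scales each to unit variance; the function name, the mixing matrix and the data are hypothetical.

```python
import numpy as np

def whiten(features, eps=1e-10):
    """Decorrelate features and give each of them unit variance.

    features: array of shape (n_trials, n_features).
    """
    centered = features - features.mean(axis=0)
    cov = centered.T @ centered / (centered.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    transform = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ transform

# Hypothetical correlated features with unequal power.
rng = np.random.default_rng(0)
mixing = np.array([[1.0, 0.5, 0.0, 0.0],
                   [0.0, 1.0, 0.5, 0.0],
                   [0.0, 0.0, 1.0, 0.5],
                   [0.0, 0.0, 0.0, 1.0]])
features = rng.standard_normal((500, 4)) @ mixing
white = whiten(features)
```

The covariance of the whitened features is the identity matrix. Note that whitening also amplifies directions that contain mostly noise, which may be one reason it does not help in every subject.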
The next step in the analysis was the crosscorrelation (equivalent to a convolution with the
time-reversed modulator) of every trial with its original modulator, as shown in figure 4. This works by sliding the original
stimulus modulator over the corresponding EEG epoch and at each timepoint computing the
correlation. Note that this approach relies on two essential assumptions: that the noise codes have an
autocorrelation peak only at time lag zero, and that there is little crosscorrelation between codes.
Figure 4: schematic overview of cross-correlation analysis. This schematic overview illustrates how the correlation analysis was done. When presenting an auditory stimulus to the subject, EEG data was recorded and sliced out 250 ms before stimulus onset. After the pre-processing of the data the correlation was computed. As shown here, the original stimulus modulator was slid over the EEG signal for 0.5 seconds, and at each time-point a correlation value was computed. In the bottom box the resulting correlation values are plotted, with a peak after stimulus onset, indicated with the dotted line.
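The sliding correlation can be sketched as follows. The binary random code is only a stand-in for the actual pseudo-noise modulators, and the signal amplitude, the delay of 26 samples (about 50 ms at 512 Hz) and the function name are hypothetical choices for illustration.

```python
import numpy as np

def sliding_correlation(eeg, modulator, max_lag):
    """Pearson correlation between the modulator and the EEG at each lag.

    eeg: 1-D signal of one channel with at least len(modulator) + max_lag samples.
    Returns max_lag + 1 correlation values, for lags 0 .. max_lag.
    """
    n = len(modulator)
    m = (modulator - modulator.mean()) / modulator.std()
    values = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        segment = eeg[lag:lag + n]
        s = (segment - segment.mean()) / segment.std()
        values[lag] = np.mean(m * s)
    return values

rng = np.random.default_rng(1)
code = rng.choice([-1.0, 1.0], size=1024)   # stand-in pseudo-random binary code
delay = 26                                  # hypothetical neuronal delay in samples
eeg = rng.standard_normal(1024 + 256)       # background activity
eeg[delay:delay + 1024] += 0.5 * code       # delayed response to the code, buried in noise
corr = sliding_correlation(eeg, code, max_lag=256)   # slide over 0.5 s
peak_lag = int(np.argmax(corr))
```

Because the code has little correlation with shifted copies of itself, the correlation values show a single clear peak at the lag where the response to the code is present.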
2.4.3 Classification
The final step was to classify the crosscorrelation values with a linear logistic regression
classifier. A linear classifier assigns items to classes based on
a linear combination of their features. This process can be described as splitting the
high dimensional input space with a hyperplane; for a two-class problem, points on
either side of the plane are classified as belonging to different classes.
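A minimal sketch of a linear logistic regression classifier is given below. The plain gradient descent, the learning rate, the regularization strength and the synthetic two-class data are assumptions chosen for illustration; they are not the settings used in this experiment.

```python
import numpy as np

def train_logistic(X, y, lr=0.1, n_iter=500, l2=1e-3):
    """Fit weights w and bias b of p(y=1|x) = sigmoid(w.x + b) by gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / len(y) + l2 * w)   # gradient of the regularized loss
        b -= lr * np.mean(p - y)
    return w, b

def predict(X, w, b):
    """Class 1 on the positive side of the hyperplane w.x + b = 0, else class 0."""
    return (X @ w + b > 0).astype(int)

# Hypothetical two-class feature vectors.
rng = np.random.default_rng(3)
y = np.repeat([0, 1], 100)
X = rng.standard_normal((200, 5)) + (2 * y[:, None] - 1)
w, b = train_logistic(X, y)
accuracy = np.mean(predict(X, w, b) == y)
```

The weight vector w is the normal of the separating hyperplane; its entries show how strongly each feature contributes to the decision.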
The performance of the classifier was assessed using a 30-fold cross-validation, where
the data are split into 30 equally-sized ‘folds’. For each fold a separate classifier was
trained, using that fold as the testing set and the other 29 folds as the training set. The resulting
cross-validation estimates are reliable, but may underestimate the achievable performance, because
the training is based on a subsample of the available data (Lalor et al., 2005; Bishop, 2008).
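The cross-validation procedure can be sketched as below. The nearest-class-mean classifier is only a stand-in to keep the example self-contained, and all names and the synthetic data are hypothetical.

```python
import numpy as np

def k_fold_indices(n_trials, k, seed=0):
    """Shuffle the trial indices and split them into k nearly equal folds."""
    order = np.random.default_rng(seed).permutation(n_trials)
    return np.array_split(order, k)

def cross_validate(X, y, k, fit, predict):
    """Return per-fold accuracies; each fold serves as the test set exactly once."""
    folds = k_fold_indices(len(y), k)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        model = fit(X[train_idx], y[train_idx])
        accuracies.append(np.mean(predict(model, X[test_idx]) == y[test_idx]))
    return np.array(accuracies)

# Stand-in classifier (nearest class mean), only to make the sketch runnable.
def fit_means(X, y):
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

def predict_means(model, X):
    m0, m1 = model
    return (np.linalg.norm(X - m1, axis=1) < np.linalg.norm(X - m0, axis=1)).astype(int)

rng = np.random.default_rng(2)
y = np.repeat([0, 1], 150)
X = rng.standard_normal((300, 4)) + y[:, None] * 1.5
acc = cross_validate(X, y, k=30, fit=fit_means, predict=predict_means)
mean_acc = acc.mean()
se_folds = acc.std(ddof=1) / np.sqrt(len(acc))   # standard error between folds
```

The mean over folds and the standard error between folds correspond to the per-subject entries reported in the result tables.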
Since the trial length is relatively short, combining a few trials is an option to improve
classification accuracy. Moreover, it gives more information about relative strength of the
signal extracted. There are three ways in which one could combine multiple trials. The first is
averaging the EEG of multiple subsequent trials which increases the sensitivity for evoked
oscillations. The second is to combine multiple trials by concatenating them, which might
lead to an improved frequency resolution because the signal length increases. The last option
is to combine the single trial output of the classifier, which increases the probability that a
trial belongs to one class. According to Kallenberg (2007) all three methods seem to give
comparable classification improvements. It depends on the stimulus type which of these
methods performs best; in this experiment all stimuli are time-locked, therefore the first
method can be used.
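The first method, averaging subsequent time-locked trials, can be sketched as follows. The 40 Hz sinusoid standing in for the evoked response, the noise level and the function name are hypothetical.

```python
import numpy as np

def average_consecutive(trials, n_combine):
    """Average every n_combine subsequent trials of the same class.

    trials: array of shape (n_trials, n_samples); leftover trials are dropped.
    """
    n_groups = trials.shape[0] // n_combine
    used = trials[:n_groups * n_combine]
    return used.reshape(n_groups, n_combine, -1).mean(axis=1)

# Hypothetical time-locked response buried in noise.
rng = np.random.default_rng(4)
evoked = np.sin(2 * np.pi * 40 * np.arange(256) / 512)
trials = evoked + rng.standard_normal((90, 256))
pairs = average_consecutive(trials, 2)
triplets = average_consecutive(trials, 3)
noise_single = (trials - evoked).std()
noise_pairs = (pairs - evoked).std()
```

Averaging n trials leaves the time-locked response intact and reduces uncorrelated noise by roughly a factor of the square root of n, at the cost of dividing the number of available trials by n.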
2.4.4 Weight vector decomposition
The weight vectors produced by the classifier were further decomposed using singular value
decomposition (SVD), which decomposes the weights into several linearly
independent components. After this the strongest weights for each different stimulus dataset can be
used to generate a plot with the distribution over the scalp. This would make it possible to
infer if the responses to the different stimuli have the same spatial distribution. Moreover, one
might be able to infer if there is more than one source which is generating the neuronal
response. For more precise source localization one needs other methods like beamforming,
but these go beyond the scope of the currently conducted experiment.
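A sketch of such a decomposition is given below, assuming the classifier weights are arranged as a channels by time-lags matrix (an assumption; the arrangement is not specified in the text). The synthetic weights and the function name are hypothetical.

```python
import numpy as np

def strongest_component(weights):
    """Decompose a (n_channels, n_lags) weight matrix with SVD.

    Returns the spatial pattern and time course of the strongest component,
    and the fraction of the total weight energy explained by each component.
    """
    U, s, Vt = np.linalg.svd(weights, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return U[:, 0], Vt[0], explained

# Hypothetical weights: one spatial gradient with one temporal peak, plus noise.
rng = np.random.default_rng(5)
spatial = np.linspace(-1.0, 1.0, 16)
temporal = np.exp(-0.5 * ((np.arange(64) - 26) / 5.0) ** 2)
weights = np.outer(spatial, temporal) + 0.01 * rng.standard_normal((16, 64))
pattern, course, explained = strongest_component(weights)
```

The sign of each component is arbitrary, so only the absolute values of the spatial pattern are meaningful. A second component that explains a substantial fraction of the energy would hint at a second source.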
3 Results
3.1 Perceptual data
On both the frequency and noise tagging datasets the effectiveness of the data analysis
methods, described in the methodological section, was tested. These different steps were the
main part of the standard data analysis protocol and included: normalizing the 1/f-effect,
bandpassing with various bandwidths and a correlation analysis.
Classification performance throughout this text is expressed as the average
percentage of correct classifications followed, between brackets, by the standard error (SE)
between subjects. For the frequency tagging data a number of analysis steps improved
classification performance significantly. Classification performance using the raw EEG
time-series was poor, 56 (3). Applying a bandpass between 30 Hz and 80 Hz improved classification
rates to 71 (8); the improvement was most substantial in three of the subjects. Applying different
bandwidths did not further improve performance. Nor did normalizing the data
improve performance; in fact it slightly worsened the performance to 66 (5). By correlating
the original stimulus modulator with the data, classification rates improved to 86 (4). As one
would expect, combining the correlation analysis with either a bandpass, 87 (4), or the
normalizing approach, 88 (4), did not improve classification rates significantly (p > 0.1). This
means that only the correlation approach is clearly useful for frequency tagged data, although either
bandpassing or normalizing the data might improve performance in some subjects. For
individual results see table 1.
Frequency tagging: perceptual, % (SE between folds)
Subject                 1       2       3       4       5       Mean (SE)
1 trial                 98(1)   74(2)   87(2)   86(2)   93(1)   88(4)
2 trials                99(1)   76(3)   94(2)   89(3)   96(2)   91(4)
1 trial, whitened       97(1)   76(3)   76(2)   81(3)   89(2)   84(4)
2 trials, whitened      100     83(4)   83(3)   90(2)   93(2)   90(3)

Noise tagging: perceptual, % (SE between folds)
Subject                 1       2       3       4       5       Mean (SE)
1 trial                 94(2)   58(2)   69(2)   66(2)   78(3)   73(6)
2 trials                99(1)   62(3)   74(3)   70(4)   88(2)   79(7)
1 trial, whitened       84(2)   64(2)   55(2)   55(3)   65(4)   65(5)
2 trials, whitened      92(2)   74(2)   60(3)   56(4)   75(3)   71(6)
Table 1: classification performances of the perceptual condition. The first two rows contain classification rates obtained using the standard data analysis protocol (bandpassing, norm-filtering and correlation analysis). The last two rows are the results obtained after whitening of the data. For each subject the averaged percentages for a 30-fold crossvalidation are given, and between brackets the standard error (SE) between different folds. The mean columns show the averaged percentages over subjects, and between brackets the SE.
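The mean (SE) columns can be recomputed from the per-subject values. The sketch below assumes the SE between subjects is the sample standard deviation divided by the square root of the number of subjects; under that assumption the single-trial rows of table 1 (standard protocol) round to the reported 88 (4) and 73 (6).

```python
from math import sqrt
from statistics import mean, stdev

def mean_and_se(values):
    """Mean and standard error (sample SD / sqrt(n)) of a list of values."""
    return mean(values), stdev(values) / sqrt(len(values))

# Single-trial percentages of subjects 1 to 5 from table 1 (standard protocol).
frequency_tagging = [98, 74, 87, 86, 93]
noise_tagging = [94, 58, 69, 66, 78]

freq_mean, freq_se = mean_and_se(frequency_tagging)
noise_mean, noise_se = mean_and_se(noise_tagging)
```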
The averaged results for frequency tagging data are shown in figure 5. This figure
shows the cross-correlation values between the original stimulus modulators and the two
frequency tagging datasets. In order to visually inspect how strong the correlation values are
of the modulator with the EEG dataset, control plots are shown where the stimulus modulators
are correlated with the other datasets. From this figure it becomes clear that the averaged
correlation values are much stronger than the ones in the control plots. Furthermore, the
averaged correlation value of the 42 Hz trials is higher than the averaged value of the 64 Hz
trials.
Figure 5: averaged correlation values of frequency tagging. The averaged correlation values for subject 1 are plotted over time and across electrodes. The top two plots show the cross-correlation values with the datasets of the 42 Hz stimulus either correlated with the 42 Hz modulator (top left plot), or with the 64 Hz modulator (top right plot). The middle two plots show the correlation values for the 64 Hz stimulus, again correlated with the 42 Hz modulator (left middle plot) or the 64 Hz modulator (right middle plot). In the bottom plots, the time-course averaged over channels is shown. Note that stimulus onset occurs at time point zero.
On the noise tagging dataset the different analysis methods performed differently in
some subjects. Classification performance on the not-normalized data was 60 (5).
Bandpassing the data between 30 and 80 Hz improved the classification performance in four
out of five subjects, to 68 (7) on average. Only subject 2 had a better classification
performance when the data were not bandpassed (70 versus 54 percent). None of the other
bandpass widths that were tested led to any improvement of the results. There is
an indication that correlating the data with the original stimulus modulators slightly improves
classification performance, to 71 (7) on average; this average increases by
4 percentage points when subject 2 is excluded. Further, normalizing the data also improves classification
performance a little, to 73 (6), and again this average increases to 77 percent
when subject 2 is excluded.
In figure 6 the averaged results for noise tagging are shown. These have a peak
correlation value around 50 ms after stimulus onset. In addition, this figure shows that both
codes do not correlate with the other code datasets.
Figure 6: averaged correlation values of noise tagging. The averaged correlation values for subject 1 are plotted over time and across electrodes. The top two plots show the cross-correlation values with the datasets of the A code either correlated with the A code modulator (top left plot), or with the B code modulator (top right plot). The middle two plots show the correlation values for the B code, again correlated with the A code modulator (left middle plot) or the B code modulator (right middle plot). In the bottom plots, the time-course averaged over channels is shown. Note that stimulus onset occurs at time point zero, and neuronal delays are ~50 ms.
After the standard analysis protocol, whitening (as explained in section 2.4.2, this
gives all features equal power) was applied to the data, which gave somewhat contradictory
results. For frequency tagging, whitening did not on average improve classification rates
significantly, 84 (4), (F=3.05, p>0.05); its effect was also inconsistent across subjects (see table 1),
with performance decreasing by 4 percentage points on average.
For the noise tagging data whitening did not produce any significant change, 65 (5), (F=5.2,
p>0.05): performance decreased in four subjects and increased only in subject 2.
Combining two trials by averaging them prior to the correlation analysis improved
classification of the frequency tagging data significantly, by 3 percentage points (F=9.8, p<0.05). For the noise tagging
data the classification performance was 6 percentage points higher on average (F=24.9, p<0.01).
3.2 Attentional data
The attentional datasets were analyzed in the same way as the perceptual datasets. For the
frequency tagging dataset classification performance on average was 60 (2), but only two
subjects (1 and 2) were classified at or above 60 percent (67 percent and 60 percent respectively).
The three other subjects were classified slightly above chance level (58, 56 and 57 percent).
For noise tagging classification performance averaged 54 (1) percent. Only subject 1 was
classified significantly above chance level (57).
Frequency tagging: attentional, % (SE between folds)
Subject                 1         2         3         4         5         Mean (SE)
1 trial                 67(3)**   60(4)*    58(3)*    56(3)     57(4)     60(2)
2 trials                70(5)**   68(5)**   65(5)**   57(4)     57(7)     63(3)
1 trial, whitened       61(4)*    71(3)**   53(4)     53(4)     55(3)     59(3)
2 trials, whitened      70(6)**   76(5)**   51(5)     52(5)     58(5)     61(5)

Noise tagging: attentional, % (SE between folds)
Subject                 1         2         3         4         5         Mean (SE)
1 trial                 57(2)*    55(3)     53(3)     49(3)     55(3)     54(1)
2 trials                57(4)     51(4)     54(3)     54(5)     55(4)     54(1)
3 trials                61(5)*    51(5)     56(5)     48(5)     62(5)*    56(3)
1 trial, whitened       64(3)**   57(3)*    54(3)     52(2)     61(2)**   58(2)
2 trials, whitened      67(4)**   57(4)     51(3)     56(4)     67(4)**   60(3)
3 trials, whitened      73(4)**   53(5)     54(6)     57(4)     71(5)**   62(4)
Table 2: classification performance of the attentional condition. As in table 1 the top rows are the classification rates obtained using the standard data analysis protocol. The last three rows show the results obtained including whitening in the data analysis. Note that for the frequency tagged data combining three trials was not possible because the dataset contained too few trials (see section 2.3). Significance levels are indicated as follows: ‘*’ refers to (p < 0.05) and ‘**’ refers to (p < 0.01).
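The text does not state which statistical test underlies the significance markers. One common choice, shown here purely as an assumption, is a one-sided exact binomial test against the chance level of 50 percent for a two-class problem. The function name is hypothetical.

```python
from math import comb

def binomial_p_value(n_correct, n_trials, chance=0.5):
    """Probability of at least n_correct correct trials when guessing at chance."""
    return sum(
        comb(n_trials, k) * chance ** k * (1 - chance) ** (n_trials - k)
        for k in range(n_correct, n_trials + 1)
    )

# Small worked case: 5 or more correct out of 10 at chance level.
p_example = binomial_p_value(5, 10)
```

With this test, whether a given percentage is significantly above chance depends strongly on the number of test trials, which is why the same percentage can be marked as significant for one dataset and not for another.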
In order to increase classification performance on both datasets, whitening of the data
and combining multiple trials were explored further.
Combining multiple trials was done by averaging the EEG of two or three subsequent
trials. For the frequency tagging data a 20-fold crossvalidation was done because this dataset
consisted of too few trials for a 30-fold classification, and only a combination of two trials
was possible because of the smaller number of trials in this dataset. For this dataset combining
multiple trials did on average not improve classification performance, two combined trials
were classified on average at 63(3). However, this number is influenced by the two last
subjects (4 and 5) who were classified near chance level. Combining two trials did improve
classification for the first three subjects, to 70, 68 and 65 respectively, on average for these
subjects an improvement of 6 percent (see table 2).
For noise tagging, combining three subsequent trials before classification did not
improve classification rates on average, but two subjects (1 and 5) were now classified above
60 percent (61 and 62). The remaining three subjects were still classified at chance level.
Whitening of the frequency tagged data gave only a classification performance
improvement for subject 2, from 60 to 71 %. In three subjects it did not change classification
performance, but it did drop performance to chance level in subject 3. For the noise tagging
data, whitening did improve classification rates in the two subjects (1 and 5), that had
classification levels above chance level with the standard analysis methods, to 64 and 61 %,
respectively. Combining three subsequent trials improved the classification results further to
73 and 71 %, respectively.
3.3 Behavioral data
The feedback on each sequence in both experimental conditions is summarized in table 3. For
each sequence there were four answer possibilities, which means that if a subject gave a
random answer, performance would be 25 percent. All subjects performed quite well during
the perceptual condition. In the attentional condition two subjects (3 and 4) had lower
feedback scores compared to the other two subjects, especially for the frequency tagging
sequences.
                                           Subject: 1     2     3     4     5     Mean
Perceptual condition    Noise tagging               -     71    92    63    96    81
                        Frequency tagging           -     81    88    81    100   88
                        Mean                        -     75    90    70    98    83
Attentional condition   Noise tagging               -     88    88    63    100   85
                        Frequency tagging           -     90    60    40    90    70
                        Mean                        -     88    77    54    96    79
Table 3: behavioral performance in both experimental conditions. For each subject the averaged feedback scores (expressed in percentage correct) on each sequence are shown. Due to experimental error no feedback scores are available for subject 1.
Figure 7: averaged weight perceptual condition. In the left top plot the strongest weight for code A is plotted, the right top plot shows the strongest weight for code B. The plots show a clear front to back distribution. The middle plot shows the timecourse of the strongest weight, for noise tagging averaged across subjects. On the left middle plot, the strongest weight for the 42 Hz stimulus is shown, on the right the strongest weight for the 64 Hz stimulus is plotted. The bottom plot shows the timecourse of the strongest weight, for frequency tagging. Note that the intensity of the colors reflects how strong the weight is, - the absolute value is important.
3.4 Weight decomposition
On both perceptual datasets the classifier weights were decomposed, as described in section
2.4.4, to see how the correlation values were distributed over the scalp. In figure 7 the
averaged strongest weight over all subjects is plotted for the perceptual condition. For both
noise codes a clear front to back distribution becomes visible. There is an indication that a
second, less powerful, source is present in this data, but the results are not consistent across
subjects (see appendix A for more details).
Figure 8: averaged weight for attentional condition. The strongest weight for all stimuli for subject 1 are plotted. See figure 7 for layout details.
For frequency tagging, the scalp distribution is a bit different than for the noise
tagging, having a more central to back distribution. There is no clear indication that a second
source is present in the data (see appendix B).
The strongest weight of the attentional condition is shown in figure 8. Because only
subject 1 showed attentional modulation in both conditions, this subject was used to
generate these plots.
4 Discussion
In the current experiment a new type of stimulus modulator was introduced, called noise
tagging. It was hypothesized to have distinct advantages over frequency tagging. The main
advantage was that of the spreading of power over multiple frequency bands, making signal
detection more robust to interfering signals in some frequency bands, an important feature in
BCI applications. Because no one has yet investigated a paradigm using noise stimuli in
humans, the main challenge was whether the noise codes could be extracted from the neuronal
signal. If this were possible, it was further hypothesized that noise tagging could serve as a
new BCI paradigm based on selective attention. Frequency tagged stimuli were used as a
control. The experimental results confirmed that the noise tagged stimuli could be extracted
successfully from the EEG signal on a single trial basis. Using selective attention to one out of
two auditory streams, it was investigated if noise tagged stimuli could be used for a BCI. The
experimental findings were not consistent across subjects: only two subjects were classified
above chance level in the attentional condition. Nonetheless, the results are comparable to
those of frequency tagging, where three subjects were classified successfully.
In this section the results will be discussed more thoroughly. Furthermore, in the next
section the limitations of and possible improvements for noise tagging as a BCI will be
discussed. In the third section a number of limitations of the current experiment and some
recommendations for further research will be discussed.
Although it is difficult to draw any strong conclusions based on the small group of
subjects, a number of interesting issues can still be discussed. If one looks at the difference in
classification performance between frequency tagging and noise tagging in the perceptual
condition, the latter classifies ~15 % worse than the frequency tagging data, even though the
frequency tagging dataset contained fewer trials. This suggests that, in the perceptual condition, frequency tagging is superior to noise tagging
because it elicits a stronger perceptual response. Moreover, high frequency tagging
performance on the perceptual dataset seems to be a very good predictor for the performance
of the noise tagging data (see figure 9).
Figure 9: Relation between the responses (percentages correct classification) in the perceptual condition. Each data point represents the performance on both frequency and noise tagging of an individual subject.
There are a number of possible explanations for this relation. One is that the signal to
noise ratio is rather constant within a subject, which is reflected in the amount of reliable signal that can be
extracted from that subject. Another possibility is that the same circuits underlie the
generation of the responses to both frequency and noise tagging. However, this latter
explanation might be a topic of debate when looking at the performance on the attentional datasets.
For the attentional condition the superior performance of frequency tagging over noise
tagging seems to diminish. If one takes the best performances, three subjects can be classified
at 68 % on average for the frequency tagged data, that is when combining two trials. For noise
tagging in the attentional condition two subjects can be classified above chance, on average
67 % using two combined trials and whitening. These percentages are relatively close,
suggesting that frequency tagging, which had a better classification performance on the
perceptual data, somehow lacks this advantage in the attentional condition. Given the fact that
drawing any strong conclusions based on a few subjects is not possible, one could only
imagine a number of possible explanations for the found effects.
A possible explanation is that too many subjects were classified at or slightly above
chance level, which makes it impossible to infer what is going on. Another reason might be
that the frequency tagged stimuli are harder to separate perceptually - due to their regularity
the subject has no cues other than the carrier and location by which frequency modulated
stimuli can be distinguished. The noise tagged stimuli seemed to be easier to attend to,
because their patterns made them recognisable. Moreover, in the current experimental design
the modulators were matched to be integers of the carriers, which were therefore half an
octave apart. This combination might have introduced perceptual mixing of both frequency
tagged stimuli, making it harder to separate them. The behavioral data of the attentional
condition do not fully support this hypothesis. Subject 5 has high behavioral scores, but low
classification performance, and in subject 3 it is the other way around. In subjects 2 and 4
behavioral data can be matched to classification performance.
Other possible explanations might be found in the underlying mechanisms that
generate the response. It might be the case that the frequency tagging responses decrease
non-linearly in power due to interference between the two stimuli.
Frequency tagged stimuli are static, hence more likely to engage the same neuronal
mechanisms, and therefore more susceptible to interference. This is in contrast with noise tagged
stimuli, which have dynamically changing properties. That is, noise tagging has the advantage
that it has a broad band of frequencies and perhaps engages multiple neuronal
mechanisms at the same time.
As described in the introduction, responses to frequency tagged stimuli are strongest at modulation
frequencies around 40 Hz. This preference might be explained by resonating neuronal circuits that are
intrinsically tuned to this frequency range. Recall that in the current experiment the
modulators were at 42 Hz and 64 Hz. Though the first modulator frequency is in the preferred
frequency range, which was indicated by the stronger response in the perceptual condition, the
neuronal response to the 64 Hz stimulus was less strong. Although this reasoning is very
hypothetical, it might be true that the 40 Hz preference causes stronger responses in the
attentional condition. In the following section, the use of noise tagging for a BCI will be
evaluated.
4.2 Noise tagging as a new BCI paradigm?
Recall (from section 1.2.2) that when designing a new BCI system, there are several issues
that need to be addressed: 1) how to extract the right cognitive processes, 2) the amount of
training time, 3) the amount of data transfer, 4) and how to deal with subject variability. In
retrospect, this experiment dealt with these issues in the following ways. To begin with, in
this experiment the stimuli were fixed, and extractable in all subjects in the perceptual
experimental condition. This means, that in the selective attention condition, only selective
attention is under the subject’s direct control. If subjects were capable of attending to the right
stimulus to the exclusion of everything else, classification performance should be perfect.
However, selective attention is subject to a wide variety of contextual influences, among
which concentration, task length and task difficulty (Curran et al., 2003). Even if one would
be able to perfectly attend to the stimuli, different mental strategies might be used in different
circumstances. For instance the subject could attend to the location or specific stimulus
characteristics like pitch. This illustrates some of the problems the classifier is faced with,
making it difficult to classify each trial.
The second problem was that in the current experiment subjects were tested only once,
so the training time was very short. Thus a subject was able to acquire control in the paradigm
used relatively quickly, which is an advantage compared to systems requiring weeks of
training. However, the behavioral data suggested that some subjects had difficulty in the
attentional condition, reflected in the low percentages of correct answers. Therefore, a few
training sessions might help to get the subject familiar with the task, and improve the
classification performance.
Thirdly, the number of successful classifications is the feature that is probably most
important for a successful BCI system. Classification performance can be translated to an
information transfer rate. According to the formula provided in Wolpaw and colleagues
(1998) the averaged performance in both attentional conditions is comparable to a bitrate of
1.4 bits per minute (bpm).
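The commonly used form of this information transfer rate, which we assume is the formula referred to here, gives the number of bits per trial for N classes and accuracy P as log2(N) + P log2(P) + (1 - P) log2((1 - P) / (N - 1)). A minimal sketch follows; the function names are hypothetical, and the number of trials per minute is left as a free parameter because it depends on assumptions about trial timing that are not restated here.

```python
from math import log2

def bits_per_trial(p, n_classes=2):
    """Information per trial for accuracy p and n_classes equally likely classes."""
    bits = log2(n_classes)
    if p > 0:
        bits += p * log2(p)
    if p < 1:
        bits += (1 - p) * log2((1 - p) / (n_classes - 1))
    return bits

def bits_per_minute(p, trials_per_minute, n_classes=2):
    """Scale the information per trial by the number of trials per minute."""
    return bits_per_trial(p, n_classes) * trials_per_minute
```

For a two-class problem the measure is zero at chance level (P = 0.5) and reaches one bit per trial only for perfect classification, so accuracies around 60 percent carry very little information per trial.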
The final problem a BCI system has to account for is subject variability. As discussed
in the introduction, a universally applicable BCI system can only be constructed by studying
BCI systems in various modalities, to determine which is least vulnerable to subject
variability. As such, a BCI system based on auditory signals is a useful contribution to the
range of BCI systems that are developed. In the current experiment the classification
performance of noise tagging was not superior to frequency tagging performance. Further
experiments on noise tagging, in auditory as well as other modalities, will have to prove its
use for a BCI. Some recommendations for changes in experimental design will be discussed
in the next section.
4.3 Recommendations for further research
Overall, even the classification rates obtained after combining multiple trials are too low to
use in an online BCI. In this section, I will describe what could possibly explain this moderate
classification performance, and discuss a number of possible improvements. A
number of reasons could explain the poor average performance of the subjects in the current
experiment. Some subjects reported having difficulty attending to one side all the time.
Possibly the experimental task was not interesting enough to capture the subject’s attention
over the whole sequence. I would argue that when we make the task more interesting for the
subject, the attentional effects in the neuronal response will increase, as mood and motivation
are suggested to play a role in learning to control a BCI (Curran et al., 2003; Nijboer et al.,
2008b). Furthermore, the attentional paradigm used can be argued to be too different from the
natural situation when attention is needed. The natural function of attention, which is to
extract relevant information from the surroundings, is not exploited during the experimental
task. A possible implementation on an experimental level could be to choose a different
carrier (for instance an instrumental voice), or making a sequence meaningful as a whole (for
instance using a combination of increasing and decreasing carrier tones), or giving the subject
online feedback about their performance.
5 Conclusion
The main hypothesis associated with this experiment was that noise tagging would offer
distinct advantages over frequency tagging as part of an experimental BCI system. However,
this experiment demonstrated that the performance of the noise tagging paradigm did not
improve significantly over the performance of the frequency tagging paradigm. Still, this
experiment did show that noise tagging does work and as such offers new opportunities for
researchers in BCI to design a viable system; because this is such a novel approach there is still
a lot of room for improvement. For fundamental neuroscience the expansion of the catalog of
stimuli retrieval paradigms is a small but important next step in our exploration of the workings
of the (human) brain.
6 Acknowledgements
I would like to thank my supervisors Jason Farquhar, Peter Desain and Stan Gielen for their
support and advice, and Rutger Vlek and Philip van den Broek for their valuable input.
7 References
Alho, K., Medvedev, S.V., Pakhomov, S.V., Roudas, M.S., Tervaniemi, M., Reinikainen, K.,
Zeffiro, T. & Naatanen, R. (1999). Selective tuning of the left and right auditory cortices during spatially directed attention. Brain Res. Cogn. Brain Res., 7(3): 335-341.
Allison, B.Z., McFarland, D.J., Schalk, G., Zheng, S.D., Jackson, M.M. & Wolpaw, J.R.
(2008). Towards an independent brain-computer interface using steady state visual
evoked potentials. Clin. Neurophys., 119: 399-408.
Babiloni, F., Cincotti, F., Lazzarini, L., Millàn, J., Mourino, J., Varsta, M., Heikkonen, J.,
Bianchi, B. & Marciani, M.G. (2000). Linear classification of low-resolution EEG
patterns produced by imagined hand movements. IEEE Trans. Rehabil. Eng., 8:186-
188.
Bailey, B.J. (1984). Cochlear prosthesis implantation: review of the issues. JAMA. 251(24):
3282.
Bieser, A. & Muller-Preuss, P. (1996). Auditory responsive cortex in the squirrel monkey:
neural responses to amplitude-modulated sounds. Exp. Brain Res., 108 (2): 273-284.
Birbaumer, N., Ghanayim, N., Hinterberger, T., Iversen, I., Kotchoubey, B., Kübler, A.,
Perelmouter, J., Taub, E. & Flor, H. (1999). A spelling device for the paralyzed.
Nature, 398:297-298.
Birbaumer, N., Kübler, A., Ghanayim, N., Hinterberger, T., Perelmouter, J., Kaiser, J.,
Iversen, I., Kotchoubey, B., Neumann, N. & Flor, H. (2000). The thought translation
device (TTD) for completely paralyzed patients. IEEE Trans. Rehabil. Eng., 8:190-
192.
Bishop, C.M. (2008). Linear models for classification. Pattern recognition and machine
learning (ch. 4). New York; Springer Science.
Bregman, A.S. (1990). Auditory scene analysis: the perceptual organization of sounds. MIT
press.
Bryant, H.L. & Segundo J.P. (1976). Spike initiation by transmembrane current: a whitenoise
analysis. J. Physiol., 260: 279-314.
Boettcher, F.A., Madhotra, D., Poth, E.A. & Mills, J.H. (2002). The frequency-modulation
following response in young and aged human subjects. Hear. Res., 165(1-2): 10-18.
Boudreau, C.E., Williford, T.H. & Maunsell, J.H. (2006). Effect of task difficulty and target
likelihood in area V4 of macaque monkeys. J. Neurophys., 96: 2377-2387.
Carmena, J.M., Lebedev, M.A., Crist, R.E., O’Doherty, J.E., Santucci, D.M., Dimitrov, D.F.,
Patil, P.G., Henriquez, C.S., Nicolelis, M.A.L. (2003). Learning to control a brain-
machine interface for reaching and grasping by primates. PLoS Biology, 1: 193-208.
Chi, T., Gao, Y., Guyton, M.C., Ru, P. & Shamma, S. (1999). Spectrotemporal modulation
transfer functions and speech intelligibility. J. Acoust. Soc. Am., 106 (5): 2719–2732.
Curran, E.A. & Stokes, M.J. (2003). Learning to control brain activity: A review of the
production and control of EEG components for driving brain-computer interface (BCI)
systems. Brain and Cogn., 51: 326-336.
Dixon, R.C. (1994). Spread Spectrum Systems. Wiley - Interscience, 3rd edition.
Eggermont, J.J. (1994). Temporal modulation transfer functions for am and fm stimuli in cat
auditory cortex: effects of carrier type, modulating waveform and intensity. Hear. Res.,
74 (1-2): 51–66.
Ermentrout, G.B., Galán, R.F. & Urban, N.N. (2008). Reliability, synchrony and noise.
Trends Cogn. Sc., 624: 1-7.
Galambos, R., Makeig, S. & Talmachoff, P.J. (1981). A 40-Hz auditory potential recorded
from the human scalp. Proc. Natl. Acad. Sci. USA, 78(4): 2643-2647.
Grady, C.L., Van Meter, J.W., Maisog, J.M., Pietrini, P., Krasuski, J. & Rauschecker, J.P. (1997). Attention-related modulation of activity in primary and secondary auditory cortex. Neurorep., 8(11): 2511–2516.
Gold, R. (1967). Optimal binary sequences for spread spectrum multiplexing. IEEE
Trans. Infor. Theory, 13(4): 619–621.
Hari, R., Hämäläinen, M. & Joutsiniemi, S.L. (1989). Neuromagnetic steady-state responses
to auditory stimuli. J. Acoust. Soc. Am., 86(3): 1033-1039.
Herdman, A.T., Lins, O., Van Roon, P., Stapells, D.R., Scherg, M. & Picton, T.W. (2002).
Intracerebral sources of human auditory steady-state responses. Brain Topography,
15(2): 69-86.
Hershey, J. (1982). Direct Sequence Spread Spectrum Techniques. Aegean Park Press.
Fatourechi, M., Bashashati, A., Ward, R.K. & Birch, G.E. (2007). EMG and EOG artifacts in
brain computer interface systems: A survey. Clin. Neurophys., 118: 480-494.
Fritz, J.B., Elhilali, M., David, S.V. & Shamma, S.A. (2007). Auditory attention - focussing
the searchlight on sound. Cur. Opinion in Neurobiol., 17: 437-455.
Galán, R.F., Ermentrout, G.B. & Urban, N.N. (2008). Optimal time scale for spike-time
reliability: Theory, simulations and experiments. J. Neurophys., 99: 277-283.
Jancke, L., Mirzazade, S. & Shah, N.J. (1999). Attention modulates activity in the primary and the secondary auditory cortex: a functional magnetic resonance imaging study in human subjects. Neurosci. Lett., 266 (2): 125-128.
John, M.S., Dimitrijevic, A. & Picton, T.W. (2003). Efficient stimuli for evoking auditory
steady-state responses. Ear Hear., 24(5): 406-423.
Kallenberg, M. (2007). Auditory selective attention as a method for brain computer interface.
Nijmegen CNS Web Edition, 2.
Lalor, E.C., Kelly, S.P., Finucane, C., Burke, R., Smith, R., Reilly, R.B. & McDarby, G.
(2005). Steady-state VEP-based Brain-Computer Interface control in an immersive 3D
gaming environment. J. Appl. Sign. Process., 19: 3156-3164.
Lebedev, M.A. & Nicolelis, M.A. (2006). Brain-machine interfaces: past, present and future.
Trends Neurosci., 29(9): 536-546.
Liang, L., Lu, T. & Wang, X. (2002). Neural representations of sinusoidal amplitude and
frequency modulations in the primary auditory cortex of awake primates. J.
Neurophys., 87 (5): 2237–2261.
Linden, R.D., Picton, T.W., Hamel, G. & Campbell, K.B. (1987). Human auditory steady-state evoked potentials during selective attention. Electroencep. Clin. Neurophys., 66(2): 145–159.
Lotte, F., Congedo, M., Lécuyer, A., Lamarche, F. & Arnaldi, B. (2007). A review of
classification algorithms for EEG-based brain–computer interfaces. J. Neural Eng., 4:
1-13.
Mainen, Z.F. & Sejnowski, T.J. (1995). Reliability of spike timing in neocortical neurons.
Science, 268: 1503-1506.
Maurizi, M., Almadori, G., Paludetti, G., Ottaviani, F., Rosignoli, M. & Luciano, R. (1990).
40-Hz steady-state responses in newborns and in children. Audiology, 29(6): 322-328.
Müller, M.M., Picton, T.W., Valdes-Sosa, P., Riera, J., Teder-Salejarvi, W.A. & Hillyard, S.A.
(1998). Effects of spatial selective attention on the steady-state visual evoked potential
in the 20–28 Hz range. Cogn. Brain Res., 6(4): 249-261.
Müller-Putz, G.R., Scherer, R., Neuper, C. & Pfurtscheller, G. (2006). Steady-state
somatosensory potentials: suitable signals for brain-computer interfaces? IEEE Trans.
Rehabil. Eng., 14(1): 30-37.
Nijboer, F., Sellers, E.W., Mellinger, J., Jordan, M.A., Matuz, T., Furdea, A., Halder, S.,
Mochty, U., Krusienski, D.J., Vaughan, T.M., Wolpaw, J.R., Birbaumer, N. & Kübler,
A. (2008a). A P300-based brain-computer interface for people with amyotrophic
lateral sclerosis. Clin. Neurophys., 119: 1909-1916.
Nijboer, F., Furdea, A., Gunst, I., Mellinger, J., McFarland, D.J., Birbaumer, N. & Kübler, A.
(2008b). An auditory brain computer interface (BCI). J. of Neurosc. Methods, 167(1):
43-50.
Penny, W.D., Roberts, S.J., Curran, E.A. & Stokes, M.J. (2000). EEG-based communication:
a pattern recognition approach. IEEE Trans. Rehabil. Eng., 8:214-215.
Petitot, C., Collet, L. & Durrant, J.D. (2005). Auditory steady-state responses (ASSR): effects
of modulation and carrier frequencies. Intern. J. Audiol., 44(10): 567-573.
Pfurtscheller, G., Flotzinger, D. & Kalcher, J. (1993). Brain-computer interface - a new
communication device for handicapped persons. J. Microcomp. Appl., 16: 293-299.
Pfurtscheller, G., Neuper, C., Guger, C., Harkam, W., Ramoser, H., Schlögl, A., Obermaier,
B. & Pregenzer, M. (2000). Current trends in Graz Brain-Computer Interface (BCI)
research. IEEE Trans. Rehabil. Eng., 8:216-219.
Pham, M., Hinterberger, T., Neumann, N., Kübler, A., Hofmayer, N., Grether, A., Wilhelm,
B., Vatine, J.J. & Birbaumer, N. (2005). An auditory brain-computer interface based
on the self-regulation of slow cortical potentials. Neurorehabil. Neural Repair, 19:
206-218.
Ross, B., Picton, T.W., Herdman, A.T. & Pantev, C. (2004). The effect of attention on the
auditory steady-state response. Neurol. Clin. Neurophys., 22: 1-4.
Santucci, D.M., Kralik, J.D., Lebedev, M.A. & Nicolelis, M.A. (2005). Frontal and parietal
cortical ensembles predict single-trial muscle activity during reaching movements in
primates. Eur. J. Neurosci., 22(6): 1529-1540.
Schreiner, C.E. & Urbas, J.V. (1986). Representation of amplitude modulation in the auditory
cortex of the cat. I. The anterior auditory field (AAF). Hear. Res., 21 (3): 227-241.
Schreiner, C.E. & Urbas, J.V. (1988). Representation of amplitude modulation in the auditory
cortex of the cat. II. Comparison between cortical fields. Hear. Res., 32 (1): 49–63.
Schwartz, A.B. (2007). Useful signals from motor cortex. J. Physiol., 579: 581–601.
Sevostianov, A., Fromm, S., Nechaev, V., Horwitz, B. & Braun, A. (2002). Effect of
attention on central auditory processing: an fMRI study. Int. J. Neurosci., 112 (5): 587–606.
Skosnik, P.D., Krishnan, G.P. & O’Donnell, B.F. (2007). The effects of selective attention on
the gamma-band auditory steady-state response. Neurosci. Lett., 420: 223-228.
Spitzer, H., Desimone, R. & Moran, J. (1988). Increased attention enhances both behavioral
and neuronal performance. Science, 240: 338-340.
Stapells, D.R., Linden, D., Suffield, J.B., Hamel, G. & Picton, T.W. (1984). Human auditory
steady state potentials. Ear Hear., 5(2): 105-113.
Strehl, U., Leins, U., Goth, G., Klinger, C., Hinterberger, T. & Birbaumer, N. (2006).
Self-regulation of slow cortical potentials: a new treatment for children with Attention-
Deficit/Hyperactivity Disorder. Trends Cogn. Sci., 3: 151-162.
Tanaka, K., Kawakatsu, M. & Nemoto, I. (2008). Stochastic resonance in auditory steady
state responses in a magnetoencephalogram. Clin. Neurophys., 119: 2104-2110.
Thanos, S., Heiduschka, P. & Stupp, T. (2007). Implantable visual prostheses. Acta
Neurochir. Suppl., 92 (2): 465-472.
Tiitinen, H., Sinkkonen, J., Reinikainen, K., Alho, K., Lavikainen, J. & Näätänen, R. (1993).
Selective attention enhances the auditory 40-Hz transient response in humans.
Nature, 364: 59-60.
Weber, R.E. & Schwartz, A.B. (2000). Work toward real-time control of a cortical neural
prothesis. IEEE Trans. Rehabil. Eng., 8 (2): 196-198.
Wolpaw, J.R., Ramoser, H., McFarland, D.J. & Pfurtscheller, G. (1998). EEG-based
communication: improved accuracy by response verification. IEEE Trans. Rehabil.
Eng., 6:326-333.
Wolpaw, J.R., Birbaumer, N., McFarland, D.J., Pfurtscheller, G. & Vaughan, T.M. (2002).
Brain-computer interfaces for communication and control. Clin. Neurophys., 113: 767-791.
Appendix A:
Figure 10: weight decomposition for individual subjects for code A. The four strongest weights (from left to right) of code A are plotted for each subject (top to bottom). The first weight is consistent across subjects, as already shown in figure 7. The second weight is less consistent across subjects; the same holds for the B code (not shown).
Appendix B:
Figure 11: weight decomposition for individual subjects for the 42 Hz stimulus. The four strongest weights (from left to right) of the 42 Hz stimulus are plotted for each subject (top to bottom). The first weight is consistent across subjects, as already shown in figure 8. The second weight is not very consistent across subjects; the same holds for the 64 Hz stimulus (not shown).