

Appl. Statist. (2001) 50, Part 1, pp. 95-109

Multichannel electroencephalographic analyses via dynamic regression models with time-varying lag-lead structure

Raquel Prado Universidad Simón Bolívar, Caracas, Venezuela

and Mike West and Andrew D. Krystal Duke University, Durham, USA

[Received March 1999. Final revision July 2000]

Summary. Multiple time series of scalp electrical potential activity are generated routinely in electroencephalographic (EEG) studies. Such recordings provide important non-invasive data about brain function in human neuropsychiatric disorders. Analyses of EEG traces aim to isolate characteristics of their spatiotemporal dynamics that may be useful in diagnosis, or may improve the understanding of the underlying neurophysiology or may improve treatment through identifying predictors and indicators of clinical outcomes. We discuss the development and application of non-stationary time series models for multiple EEG series generated from individual subjects in a clinical neuropsychiatric setting. The subjects are depressed patients experiencing generalized tonic-clonic seizures elicited by electroconvulsive therapy (ECT) as antidepressant treatment. Two varieties of models (dynamic latent factor models and dynamic regression models) are introduced and studied. We discuss model motivation and form, and aspects of statistical analysis including parameter identifiability, posterior inference and implementation of these models via Markov chain Monte Carlo techniques. In an application to the analysis of a typical set of 19 EEG series recorded during an ECT seizure at different locations over a patient's scalp, these models reveal time-varying features across the series that are strongly related to the placement of the electrodes. We illustrate various model outputs, the exploration of such time-varying spatial structure and its relevance in the ECT study, and in basic EEG research in general.

Keywords: Bayesian inference; Dynamic latent factors; Dynamic linear models; Electroconvulsive therapy; Electroencephalography; Markov chain Monte Carlo methods; Non-stationary time series; Time series decomposition

1. Introduction

Multichannel electroencephalographic (EEG) recordings arise from the simultaneous measurement of electrical potential fluctuations at a number of sites on the scalp of a human subject. The analysis of such data is at present the least expensive and most widely available way to study human brain function effectively and non-invasively. For a useful recent collection of articles on analysing EEG data, see Angeleri et al. (1997).

The study of brain function through multichannel electroencephalography is particularly promising in studies of brain seizures induced via electroconvulsive therapy (ECT) as part of the treatment of depression. ECT involves the electrical induction of a series of generalized tonic-clonic seizures for therapeutic purposes and is the most effective known treatment for patients suffering from major depression (Weiner and Krystal, 1994). However, the mechanisms of action of this treatment remain poorly understood and there is a need to develop markers for the prediction and indication of therapeutic response. Several studies have suggested that the therapeutic effects of ECT may be associated with changes in brain function that are manifest in terms of various EEG features: changes in the patterns of evolution of spectral content over time, changes and differences in the raw amplitude of EEG waveforms recorded at different scalp locations and the relationships between activity between channels (see the review, for example, in Krystal, West, Prado, Greenside, Zoldi and Weiner (1999)). In our previous work, we have demonstrated the usefulness of various classes of dynamic models for the analysis of univariate EEG studies and we have begun exploratory studies of the relationships between multiple traces on a single subject (West et al., 1999; Prado and West, 1997; Krystal et al., 1996). Here we describe a further development and application of more formal multivariate dynamic models.

Address for correspondence: Raquel Prado, Departamento de Cómputo Científico y Estadística, Universidad Simón Bolívar, Apartado 89000, Caracas, Venezuela. E-mail: [email protected]

© 2001 Royal Statistical Society 0035-9254/01/50095

The EEG data studied here are part of a full data set, code named Ictal19, that corresponds to records of 19 EEG channels recorded in eight subjects in each of three different states: while awake, with eyes closed, before receiving an electrical stimulus; after the administration of methohexital anaesthesia but before the electrical stimulus; and, finally, during an ECT seizure (Krystal et al., 1996; Zoldi et al., 2000). During each induced seizure, 19 parallel series were recorded simultaneously from 19 Ag-AgCl electrodes of the International 10-20 EEG System, located around and over the patient's scalp utilizing a linked ear reference and two additional channels dedicated to detecting eye movement artefacts (Krystal et al., 1996). Here we study 19 EEG traces recorded in one seizure of one of the patients to characterize aspects of the temporal dynamics that aid in understanding the physiology that drives the antidepressant effectiveness of ECT.

EEG traces typically exhibit highly periodic fluctuations with power across a continuum of frequencies and with time variations in spectral structure. In the neuropsychiatric communities, EEG activity is loosely categorized into four main frequency ranges: the low frequency delta band, of roughly 0-4 Hz, the theta band, of roughly 4-8 Hz, the alpha band, of roughly 8-13 Hz, and the beta band, of more than 13 Hz. Often the appearance of activity in a specific band characterizes a particular physiological state (Dyro, 1989).

Fig. 1(a) shows a schematic representation of the approximate locations of the 19 electrodes over the scalp. Before recording, EEG signals are amplified and filtered with a band pass of 1.6-70 Hz and are then stored on magnetic tape and digitized off line at 256 Hz with 12-bit accuracy. Manual artefact rejection was performed before analysing the data to remove artefacts due to movement of the patient and other laboratory interference.

ECT seizure EEG records usually last between 0.5 and 3 min; therefore a typical series constitutes between 15000 and 50000 observations. We study subsampled series with between 40 and 50 equally spaced observations per second. After subsampling we deal with series of a few thousand observations that produce graphical displays that are essentially indistinguishable from the graphs produced by the original data. Fig. 1(b) shows sections of an EEG recording at site F7. The original recordings of about 26000 observations were subsampled every sixth observation from the highest amplitude portion of the seizure, resulting in a series of 3600 observations (83.72 s). The graph shows sections of 500 consecutive observations taken from near the start, two central sections and close to the end of the seizure period. The series display high frequency oscillations at the beginning that slowly decay into lower frequencies accompanied by an increase in the amplitude of the signal relative to the amplitude observed at initial states, until it finally decreases towards the end of the seizure. These general features are consistent across the 19 traces from this subject and seizure. However, variations over time in the frequency content and amplitude fluctuations appear to differ markedly across channels.

Fig. 1. (a) Representation of the Ictal19 electrode placement (by convention, electrodes located on the left-hand side are odd numbered whereas those on the right-hand side are even numbered; F, Fp, P, T, C and O refer to the frontal, prefrontal, parietal, temporal, central and occipital cortical regions (Dyro, 1989); electrodes down the centre of the scalp are labelled with the letter z; in particular, channel Cz is referred to as the vertex in EEG nomenclature); (b) sections of EEG voltage levels recorded during an ECT seizure at site F7 (horizontal axis: time (secs))
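The subsampling step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `subsample` and the random stand-in trace are hypothetical, and plain decimation without any additional filtering is assumed, since the text does not say otherwise.

```python
import numpy as np

def subsample(series, step):
    """Keep every `step`-th observation (plain decimation, assumed here)."""
    return np.asarray(series)[::step]

fs_original = 256.0            # digitization rate quoted in the text (Hz)
step = 6                       # "every sixth observation"
fs_sub = fs_original / step    # roughly 42.7 observations per second

# Hypothetical stand-in for a raw trace of about 26000 observations.
raw = np.random.default_rng(0).standard_normal(26000)
sub = subsample(raw, step)
```

The resulting rate falls inside the 40-50 observations per second quoted in the text; the 3600 observations analysed in the paper are a further selection from the highest amplitude portion of the seizure, which this sketch does not attempt to reproduce.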

Section 2 summarizes previous analyses of some of the Ictal19 data based on univariate time-varying autoregressive (TVAR) models and decompositions (Prado and West, 1997; West et al., 1999). In Section 3, we begin by exploring latent factor models as a multivariate modelling approach to recover and describe latent processes underlying multiple series. We then discuss dynamic regression models with time-varying lags and leads that describe additional structure linking the series that TVAR and factor models cannot capture. Finally, Section 4 provides conclusions, comments and some discussion of future directions.

2. Univariate time-varying autoregressive models and decompositions

Previous studies of EEG series using TVAR models have been most encouraging in delivering useful insights into EEG structure and their connections with ECT outcomes. Sections of some of the series from the Ictal19 data set have been analysed using this approach in Prado and West (1997), West et al. (1999) and Krystal, Prado and West (1999). A key aspect of those analyses is the use of time series decomposition methodology to explore the non-stationary time-frequency structure of latent components of the EEG traces. In this aspect, the work extends the uses of TVAR models that have been used and validated earlier as empirical representations of other kinds of EEG data (Gersch, 1985, 1987; Kitagawa and Gersch, 1996).

In Prado and West (1997) selected portions of Ictal19 series were analysed by using univariate TVAR models in which the AR parameter vector evolves stochastically in time as a random walk. Formally, write $x_t$ for the observed value of a trace at time $t$, with $t = 0, 1, \ldots$. The TVAR($p$) model is $x_t = \mathbf{x}_t'\boldsymbol{\phi}_t + \nu_t$ and $\boldsymbol{\phi}_t = \boldsymbol{\phi}_{t-1} + \mathbf{w}_t$, where $\mathbf{x}_t = (x_{t-1}, \ldots, x_{t-p})'$, $\boldsymbol{\phi}_t$ is the AR parameter vector at time $t$, $\{\nu_t\}$ is a sequence of independent normal observation innovations with possibly time-varying variances $\sigma_t^2$ and $\{\mathbf{w}_t\}$ is a sequence of independent $p$-vector parameter innovations. The degree to which $\boldsymbol{\phi}_t$ and $\sigma_t^2$ vary is determined via standard discount methods (West and Harrison, 1997). Suitable values of discount factors and $p$ may be assessed via marginal likelihoods as discussed in West et al. (1999). Once the EEG series have been modelled via TVAR models the focus is on exploring the time-frequency structure of latent processes using decompositions based on the eigenstructure of the TVAR evolution matrices of the dynamic model. The basic result states that (West et al., 1999)

$$x_t = \sum_{j=1}^{p_z} z_{t,j} + \sum_{j=1}^{p_y} y_{t,j}$$

where $p_z$ is the number of pairs of complex characteristic roots of the instantaneous AR characteristic polynomial defined by $\boldsymbol{\phi}_t$ at time $t$ and $p_y$ is the number of real characteristic roots, such that $2p_z + p_y = p$. The processes $z_{t,j}$ and $y_{t,j}$ are, under certain conditions discussed in West et al. (1999), dominated by time-varying autoregressive moving average (TVARMA(2, 1)) and TVAR(1) processes respectively. The AR structure of each $z_{t,j}$-process is quasi-periodic, with time-varying characteristic frequency and modulus $\omega_{t,j}$ and $r_{t,j}$. Fig. 2 displays two of the estimated components and trajectories of the frequency and modulus of the lowest frequency component in the analysis of series C3 based on a TVAR(12) model. Fig. 2(a) graphs, from the bottom upwards, the data, followed by the estimated $z_{t,j}$ for $j = 1, 2$, in order of increasing characteristic frequency. Component 1 corresponds to the slow wave whose instantaneous characteristic frequency and modulus are displayed in Figs 2(b) and 2(c). This component lies in the delta range which is characteristic of slow waves manifest, in particular, in the middle and late phases of effective ECT seizures (Niedermeyer, 1993; Staton et al., 1981; Weiner et al., 1991; Weiner and Krystal, 1993). Component 1 dominates in amplitude and has moduli values higher than 0.9 during most of the course of the seizure. Component 2 lies in the theta band and is much lower in amplitude and modulus than component 1 is. Higher frequency components also appear and have much lower amplitudes than components 1 and 2 have.

Fig. 2. (a) Data and estimated components in the decomposition of EEG series C3 based on a TVAR(12) model (from the bottom upwards, the graph displays the time series followed by two estimated components in order of increasing characteristic frequency; component 1 lies in the delta (0-4 Hz) band for most of the course of the seizure and component 2 lies in the theta (4-8 Hz) band); (b), (c) trajectories of the estimated characteristic frequency and modulus of the lowest frequency component in series C3 (horizontal axes: time (secs), 0-69)
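As a rough illustration of how the characteristic frequencies and moduli discussed above arise, the following sketch computes them from a single AR coefficient vector via the eigenvalues of its companion matrix. It is not the decomposition code of West et al. (1999): it only extracts root moduli and frequencies at one time point and does not reconstruct the component series $z_{t,j}$ and $y_{t,j}$. The function name `ar_components` and the example coefficients are hypothetical.

```python
import numpy as np

def ar_components(phi, fs, tol=1e-10):
    """Moduli and frequencies (Hz) of the characteristic roots of one AR(p)
    coefficient vector phi = (phi_1, ..., phi_p), sampled at fs Hz."""
    phi = np.asarray(phi, dtype=float)
    p = phi.size
    G = np.zeros((p, p))
    G[0, :] = phi                  # first row holds the AR coefficients
    G[1:, :-1] = np.eye(p - 1)     # sub-diagonal shifts the lagged values
    roots = np.linalg.eigvals(G)
    # one representative per complex conjugate pair, plus the real roots
    complex_roots = roots[roots.imag > tol]
    real_roots = roots[np.abs(roots.imag) <= tol].real
    freq_hz = np.angle(complex_roots) * fs / (2.0 * np.pi)
    order = np.argsort(freq_hz)
    return np.abs(complex_roots)[order], freq_hz[order], real_roots

# Example: an AR(2) built to have modulus 0.95 at 3 Hz (a delta-band wave).
fs = 256.0 / 6.0
omega = 2.0 * np.pi * 3.0 / fs
phi_example = [2.0 * 0.95 * np.cos(omega), -0.95 ** 2]
moduli, freqs, reals = ar_components(phi_example, fs)
```

Applying the function to the estimated coefficient vector at each time point would give trajectories of the kind shown in Figs 2(b) and 2(c).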

Repeating the TVAR(12) analysis and exploring the decompositions for the 19 EEG series we find similarities in the patterns of behaviour of the dominant frequency components across channels (see Prado and West (1997)). As we move through the time course of the seizure, the instantaneous AR characteristic polynomials exhibit and maintain at least two dominant pairs of complex conjugate roots across the 19 series, dominant in the sense of having higher moduli and lower frequencies than the remaining roots have. These two pairs of characteristic roots, whose frequency and moduli trajectories are consistent across the 19 channels, correspond to the dominant 'seizure' latent processes living in the delta and theta frequency bands. These common patterns suggest the notion of the existence of at least one latent quasi-periodic process underlying the 19 series. Modelling the traces via latent factor models, with one or two common latent processes, is therefore appropriate.

3. Multivariate dynamic structure

3.1. Factor models and autoregressions

A relatively simple model, based on the results obtained via univariate TVAR analyses, is suggested as a first step towards characterizing the system underlying the 19 series. Suppose that we have $m$ series, and let $y_{i,t}$ be the observation recorded at time $t$ on electrode $i$ ($i = 1, \ldots, m$). Assume that the system is characterized by a latent process $x_t$,

$$\begin{aligned} y_{i,t} &= \beta_i x_t + \nu_{i,t}, \\ x_t &= \sum_{j=1}^{p} \phi_{t,j} x_{t-j} + \eta_t, \\ \boldsymbol{\phi}_t &= \boldsymbol{\phi}_{t-1} + \mathbf{w}_t, \end{aligned} \qquad (1)$$

where the $\beta_i$s are regression parameters or factor weights. The unobservable process $x_t$ has a TVAR($p$) structure modelling the common seizure waveforms that consistently appeared in the decompositions of the 19 EEG series. The TVAR parameter vector $\boldsymbol{\phi}_t = (\phi_{t,1}, \ldots, \phi_{t,p})'$ follows a random walk and $\nu_{i,t}$, $\eta_t$ and $\mathbf{w}_t$ are independent and mutually independent, zero-mean Gaussian innovations; we write these distributions as $N(\nu_{i,t} \mid 0, v_i)$, $N(\eta_t \mid 0, s)$ and $N(\mathbf{w}_t \mid 0, U_t)$ for some $v_i$, $s$ and variance-covariance matrices $U_t$ controlling the variability of $\boldsymbol{\phi}_t$. Note that $s$ is constant in our analyses here, though it could more generally be time varying.
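To make the structure of model (1) concrete, the following sketch simulates data from it. It is a toy illustration only: the function name `simulate_dfm`, the choice $U_t = w_{sd}^2 I_p$ for the evolution variance, the common observational variance and all numerical values are assumptions made for the example and are not taken from the analyses reported here.

```python
import numpy as np

def simulate_dfm(beta, phi0, n, v=1.0, s=10.0, w_sd=1e-3, seed=0):
    """Simulate model (1): y_{i,t} = beta_i x_t + nu_{i,t}, with x_t a TVAR(p)
    process whose coefficient vector follows a random walk.
    Assumption: U_t = w_sd**2 * I_p and v_i = v for every channel."""
    rng = np.random.default_rng(seed)
    beta = np.asarray(beta, dtype=float)
    phi = np.array(phi0, dtype=float)
    p = phi.size
    x = np.zeros(n + p)                  # p zeros serve as start-up values
    phis = np.zeros((n, p))
    for t in range(p, n + p):
        phi = phi + w_sd * rng.standard_normal(p)     # phi_t = phi_{t-1} + w_t
        lagged = x[t - p:t][::-1]                     # (x_{t-1}, ..., x_{t-p})
        x[t] = phi @ lagged + np.sqrt(s) * rng.standard_normal()
        phis[t - p] = phi
    x = x[p:]
    y = np.outer(x, beta) + np.sqrt(v) * rng.standard_normal((n, beta.size))
    return y, x, phis

# Three hypothetical channels, the first with its weight fixed at 1.
omega = 2.0 * np.pi * 3.0 / (256.0 / 6.0)
y, x, phis = simulate_dfm(beta=[1.0, 0.8, 0.6],
                          phi0=[2.0 * 0.9 * np.cos(omega), -0.81], n=500)
```

Here the ratio `s / v` plays the role of a signal-to-noise ratio; the values used are illustrative.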

Model (1) is a dynamic factor model (DFM) with a single-factor process $x_t$ and constant factor weights. Previous approaches to latent factor modelling have focused on stationary processes rather than the non-stationary, time-varying parameter versions here. Important contributions are those of Peña and Box (1987) and Tiao and Tsay (1989), for example. Following the notation in Prado and West (1997) and Aguilar et al. (1999), a rather general DFM with $k > 1$ factors and $m$ observed series can be written in matrix form

$$\mathbf{y}_t = B_t \mathbf{x}_t + \boldsymbol{\nu}_t, \qquad (2)$$

where $\mathbf{y}_t = (y_{1,t}, \ldots, y_{m,t})'$ is the $m$-dimensional vector of observations at time $t$, $B_t = (\boldsymbol{\beta}_{1,t}, \ldots, \boldsymbol{\beta}_{k,t})$ is an $m \times k$ matrix with $\boldsymbol{\beta}_{j,t} = (\beta_{1,j,t}, \ldots, \beta_{m,j,t})'$ and $\beta_{i,j,t}$ the factor weight relating the $i$th observation $y_{i,t}$ to the $j$th factor $x_{j,t}$ at time $t$. It is usually assumed that the variance matrix of $\boldsymbol{\nu}_t$ is $V_t = \mathrm{diag}(v_{1,t}, \ldots, v_{m,t})$ so that the dependences among the $y_{i,t}$ are due exclusively to the $x_{j,t}$. Additionally, $\mathbf{x}_t = (x_{1,t}, \ldots, x_{k,t})'$ can be modelled via general dynamic linear models (West and Harrison, 1997). Modelling each $x_{j,t}$ as a TVAR process seems reasonable in the EEG framework. One important class of DFMs is that based on lagged latent factors. Suppose for instance that $x_{1,t} = x_t$ and $x_{2,t} = x_{t-1}, \ldots, x_{k,t} = x_{t-k+1}$. Then, if $x_t$ is a TVAR process, it follows that $y_{i,t}$ is a TVARMA($p$, $q$) process with $q = \max(p, k)$. It is typically assumed that $k \ll m$. In the EEG context one or two latent factors may be enough to explain the underlying structure driving the behaviour of the multiple signals and $k > 2$ but $k < 19$ factors may surely be needed to account for lagged values of the latent processes.

We begin by exploring the DFM described in equations (1) taking $v_i = v$ for all $i$. Note that model (1) is not identified. In fact, if we consider $\beta_i^* = \beta_i/c$ for some $c \neq 0$ the model can be written in terms of $\beta_i^*$, $x_t^* = c x_t$ and $\eta_t^* = c\eta_t$. To resolve this parameter identification issue we impose restrictions on the factor weights via $\beta_i = 1$ for a specific channel $i$. In addition, we assume that the signal-to-noise ratio $r = s/v$ is a fixed known quantity. With this model specification, we can proceed to posterior analysis once priors for the model parameters have been specified. In our analyses we routinely use standard reference priors for $\beta_i$ and $v$, $p(\beta_i) \propto 1$ and $p(v) \propto 1/v$, and relatively diffuse normal priors for $\boldsymbol{\phi}_0$, i.e. $N(\boldsymbol{\phi}_0 \mid \mathbf{0}, uI_p)$ with $u \to \infty$. Posterior inferences are obtained via posterior simulation methods based on basic dynamic linear model theory (West and Harrison, 1997) and standard Gibbs sampling techniques for state space models (Carter and Kohn, 1994; Frühwirth-Schnatter, 1994), as detailed in Appendix A.
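The scale ambiguity can be checked directly by substitution. The short derivation below is a worked restatement using only the symbols of model (1); it is included as an illustration and is not part of the original exposition.

```latex
\begin{aligned}
y_{i,t} &= \beta_i x_t + \nu_{i,t}
         = \frac{\beta_i}{c}\,(c\,x_t) + \nu_{i,t}
         = \beta_i^{*} x_t^{*} + \nu_{i,t}, \\
x_t^{*} &= c\,x_t
         = \sum_{j=1}^{p} \phi_{t,j}\,(c\,x_{t-j}) + c\,\eta_t
         = \sum_{j=1}^{p} \phi_{t,j}\,x_{t-j}^{*} + \eta_t^{*},
\qquad \eta_t^{*} \sim N(0,\, c^{2}s).
\end{aligned}
```

The observations therefore have the same distribution under $(\beta_i, x_t, s)$ and $(\beta_i/c, c x_t, c^2 s)$, which is why one factor weight is fixed at 1.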

Models with a single latent factor, $\beta_{Cz} = 1$ and fixed model order values $p \in \{4, \ldots, 8\}$ were fitted to the 19 EEG series. There is a recent and growing mass of literature on incorporating model order uncertainty for standard AR models via Markov chain Monte Carlo (MCMC) methods using ideas of stochastic variable selection and reversible jumps (see for instance Huerta and West (1999) and Troughton and Godsill (1997)). However, these approaches do not deal with model order uncertainty in TVAR models. Extensions to DFMs that take into account model order uncertainty in the TVAR latent structure are certainly imaginable, but they will lead to very significant increases in complexity and computational demands of the posterior simulation algorithms. We currently are satisfied with exploring a range of models with different orders to understand how resulting inferences depend on the choice and over what data-supported ranges they are insensitive. A single discount factor approach (West and Harrison, 1997) was used to specify $\{U_t\}$. Various choices of the implied discount factor $\delta$ and $r$ were explored by allowing these parameters to take values in intervals chosen on the basis of an exploration of corresponding joint marginal likelihood functions from the univariate TVAR analyses. From such analyses we inferred that appropriate values of $\delta$ lie in the range 0.994-0.996, whereas appropriate values of $r$ lie in the range 5-15.
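The role of a single discount factor can be illustrated with a forward filter for a univariate TVAR($p$) model. The sketch below assumes the standard discount construction of dynamic linear models, in which the prior variance at each step is the previous posterior variance divided by $\delta$ (equivalently, an evolution variance of $C_{t-1}(1-\delta)/\delta$). As a simplification it treats the observational variance `v` as known and constant, which is not what the analyses reported here do, and the function name `tvar_discount_filter` is hypothetical.

```python
import numpy as np

def tvar_discount_filter(x, p, delta, v=1.0, c0_scale=100.0):
    """Forward filtering for x_t = x_t' phi_t + nu_t, phi_t a random walk whose
    evolution variance is set by the discount factor delta.
    Assumptions: known constant observational variance v, prior N(0, c0_scale*I)."""
    x = np.asarray(x, dtype=float)
    m = np.zeros(p)                       # posterior mean of phi
    C = c0_scale * np.eye(p)              # posterior variance of phi
    means = np.full((x.size, p), np.nan)  # first p rows stay undefined
    for t in range(p, x.size):
        F = x[t - p:t][::-1]              # (x_{t-1}, ..., x_{t-p})
        R = C / delta                     # discounting inflates the prior variance
        q = F @ R @ F + v                 # one-step forecast variance
        A = R @ F / q                     # adaptive (gain) vector
        e = x[t] - F @ m                  # one-step forecast error
        m = m + A * e
        C = R - np.outer(A, A) * q
        means[t] = m
    return means

# Check on a simulated AR(1) series with a constant coefficient of 0.9.
rng = np.random.default_rng(1)
x_sim = np.zeros(2000)
for t in range(1, x_sim.size):
    x_sim[t] = 0.9 * x_sim[t - 1] + rng.standard_normal()
phi_hat = tvar_discount_filter(x_sim, p=1, delta=0.995)
```

Values of `delta` close to 1, such as those in the range quoted above, give slowly varying coefficient trajectories; smaller values let the coefficients adapt faster at the cost of noisier estimates.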

Fig. 3 displays graphs summarizing the results obtained from a single latent factor model with $p = 6$, $\delta = 0.994$ and $r = 10$. The convergence tests and results are based on a posterior sample of 500 draws taken from 3000 iterations of the Gibbs sampler after a burn-in of 3000 iterations. To explore the convergence of the chain various diagnostics, in addition to autocorrelation and trace plots for some of the model parameters, were carried out. In particular, the Geweke convergence diagnostic, the Heidelberger and Welch stationary test and the Raftery and Lewis diagnostic (see Brooks and Roberts (1999)) were performed for sequences corresponding to model parameters 01010, I33, $x_{701}$ and $x_{1231}$. This was done by using Bayesian output analysis program version 0.4.3 for R (see http://www.public-health.uiowa.edu/boa). The 500 draws for the posterior sample were taken every sixth iteration of the 3000 obtained from the Gibbs sampler after the burn-in period. All the MCMC sequences considered passed the Heidelberger and Welch convergence test and neither the Geweke nor the Raftery and Lewis diagnostics gave evidence that convergence was not achieved for such sequences.

Fig. 3. (a) Estimated latent factor process (top), frequency trajectory of the estimated dominant component for the latent process and approximate 95% posterior intervals at selected points (bottom left) and modulus trajectory of the estimated dominant component for the latent process and approximate 95% posterior intervals at selected points (bottom right); (b) estimated posterior means of the factor weights for the 19 EEG channels (horizontal axes in (a): time (secs))
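The burn-in and thinning arithmetic, and the kind of autocorrelation check mentioned above, can be sketched as follows. This is a generic illustration, not the Bayesian output analysis program itself: the helper names `thin` and `autocorr` are hypothetical and the chain is a random stand-in for real sampler output.

```python
import numpy as np

def thin(chain, burn_in, step):
    """Discard the first `burn_in` iterations, then keep every `step`-th draw."""
    return np.asarray(chain)[burn_in::step]

def autocorr(x, lag):
    """Sample autocorrelation of a scalar sequence at a positive lag."""
    d = np.asarray(x, dtype=float)
    d = d - d.mean()
    return float(d[:-lag] @ d[lag:] / (d @ d))

# Hypothetical stand-in for 6000 Gibbs iterations of one scalar parameter.
chain = np.random.default_rng(2).standard_normal(6000)
kept = thin(chain, burn_in=3000, step=6)       # 3000 / 6 = 500 retained draws
lag1 = autocorr(kept, 1)
```

With a burn-in of 3000 and a thinning step of 6, the remaining 3000 iterations yield the 500 draws reported in the text.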

The graph at the top of Fig. 3(a) displays the time trajectory of the estimated posterior mean of $x_t$, $\hat{x}_t$. The TVAR(6) structure assumed for $x_t$ exhibits two pairs of complex characteristic roots across the time period of the data. Time trajectories of the frequency and modulus of the lowest frequency component of $x_t$ are shown in the bottom left-hand part of Fig. 3(a). This component exhibits the same general features displayed by the slow wave dominant components that appeared in the univariate TVAR decompositions of the 19 EEG series. The frequency decreases over time and lies in the delta range (0-4 Hz), whereas the instantaneous characteristic modulus consistently takes values above 0.85. The vertical bars represent approximate 95% posterior intervals for the instantaneous frequency and modulus at the selected time points, so providing pointwise indications of uncertainties about the estimated trajectories. The bottom right-hand part in Fig. 3 shows the estimated posterior means of the factor weights $\beta_i$ for $i = 1, \ldots, 19$, denoted by $\hat{\beta}_i$. Note that we set $\beta_{Cz} = 1$. The values displayed at the approximate electrode locations correspond to the $\hat{\beta}_i$. In addition, an image created by linearly interpolating the $\hat{\beta}_i$ over the grid defined by the approximate electrode locations is displayed. This image is simply used to highlight the relationships between the estimated factor weights and is not the result of building a spatial dependence structure on the $\beta_i$ into the model. Dark intensities correspond to high values of $\hat{\beta}_i$ whereas light intensities correspond to low values. The graph exhibits a marked pattern of relationship across 'neighbouring' channels: channels located closer to Cz have factor weights that are closer to 1. An element of asymmetry is also evident from this picture. Channels located at right frontotemporal sites have smaller weights than channels located at left frontotemporal sites. Approximate 95% posterior intervals were computed for each $\beta_i$. The upper bounds of such intervals were lower than 1 in magnitude for all sites. Posterior estimates were calculated for models with $p = 4, \ldots, 8$ and various values of $r$, leading to similar conclusions in terms of the latent quasi-periodic structures of $x_t$ and the relationship between the factor weights for the 19 EEG series and the electrode locations.

The single-factor DFMs analysed reveal relationships between channels (in terms of the factor weights $\beta_i$) that univariate TVAR models cannot capture. However, estimates of the residuals exhibit substantial remaining structure. For most of the channels quasi-periodic patterns are left unexplained by single-factor models with constant factor weights. Such patterns may be explained in several ways. First, the assumption that one factor is enough to explain the structure underlying the signals is quite simplistic although necessary from the computational viewpoint. Fitting models with more factors is complicated as both parameter identification and interpretability, and computational difficulties increase significantly with the number of factors. A second likely reason for residual correlation is the need to allow for factor weights to vary over time. Additionally, cross-correlograms of the residuals display time-varying phase delays between some of the channels. In particular, cross-correlograms of the residuals for channels Cz and Fp2 show that the signal recorded at site Cz seems to be delayed with respect to the signal recorded at site Fp2 at central portions of the seizure course, whereas no delays are evident at early and late portions. Cross-correlograms for channels Cz and O1 show that the signal recorded at O1 is delayed with respect to the signal recorded at Cz for central portions of the seizure, displaying practically no delays at the beginning and the end of the seizure. This suggests exploring models that explicitly involve time-varying descriptions of the amplitude ratios and lag-lead structures across the channels.

3.2. Dynamic regression models with time-varying lag-lead structures

Consider the model

$$\begin{aligned} y_{i,t} &= \beta_{(i,t)}\, x_{t-l_{i,t}} + \nu_{i,t}, \\ \beta_{(i,t)} &= \beta_{(i,t-1)} + w_{i,t}, \end{aligned} \qquad (3)$$

with $y_{i,t}$ the observation recorded at time $t$ on channel $i$, and with the following specifications.

(a) $x_t$ is the underlying process, assumed known at each $t$.

(b) $l_{i,t}$ is the lag or lead that $y_{i,t}$ exhibits with respect to $x_t$ at time $t$. These parameters are explicitly allowed to be time varying and lie in a prescribed set of possible values $l_{i,t} \in \{-k_0, \ldots, k_1\}$. Here $-k_0$ and $k_1$ are bounds chosen a priori to specify maximum lag or lead values. Changes in $l_{i,t}$ are described via a one-step Markov chain model with specified transition probabilities $p(l_{i,t} = k \mid l_{i,t-1} = m)$, with $k, m \in \{-k_0, \ldots, k_1\}$.

(c) $\beta_{(i,t)}$ is the dynamic regression coefficient for channel $i$. In particular, if $x_t = y_{i_0,t}$ and $\beta_{(i_0,t)} = 1$ for some $i_0$ and all $t$, then $\beta_{(i,t)}$ for $i \neq i_0$ is a direct measure of dependence between channels $i_0$ and $i$. A random walk is adopted to model the evolution of $\beta_{(i,t)}$.

(d) $\nu_{i,t}$ are independent and identically distributed zero-mean observational error terms with variances $v_i$, and $w_{i,t}$ are independent and identically distributed zero-mean system innovations assumed normally distributed with variances $s_{i,t}$.

Given that $x_t$ is assumed known for all $t$ and that $\nu_{i,t}$ and $w_{i,t}$ are independent across channels, equations (3) describe a collection of univariate models rather than a multivariate $m$-dimensional model. Eliminating the lag-lead parameters from the first equation in model (3) by taking conditional expectations (with all other parameters fixed) we obtain $E(y_{i,t} \mid x_{\cdot}, \beta_{\cdot})$ as a weighted average of the lagged and led values $x_{t+k_0}, x_{t+k_0-1}, \ldots, x_t, \ldots, x_{t-k_1}$ with weights $p(l_{i,t} = -k_0)\beta_{(i,t)}, \ldots, p(l_{i,t} = k_1)\beta_{(i,t)}$. Thus each channel is modelled as a time-varying dynamic regression on past, current and future values of $x_t$, with coefficients reflecting an overall time-varying level of response $\beta_{(i,t)}$ together with changes in the relevance of lagged and lead values via the $p(l_{i,t} = \cdot)$-terms. Given that $x_t$ is the same fixed underlying process for all channels it is possible to make comparisons of any two channels via $\beta_{(i,t)}$ and $l_{i,t}$, and their estimated trajectories over time. Further model components that we need to specify are priors on $\beta_{(i,0)}$, on $v_i$ and the transition probabilities $p(l_{i,t} = k \mid l_{i,t-1} = m)$. The specification of conditional evolution variances $s_{i,t}$ is handled, as usual, via discount factors. Posterior inferences may be obtained via MCMC algorithms, as detailed in Appendix A.
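The weighted-average form of the conditional expectation can be written out as a small function. This is a minimal sketch for a single channel and time point; the function name `expected_response` and the example numbers are hypothetical, and the lag probabilities are simply passed in rather than estimated.

```python
import numpy as np

def expected_response(x, t, beta_t, lag_probs, k0, k1):
    """beta_(i,t) * sum_k p(l_{i,t} = k) * x_{t-k}, for k = -k0, ..., k1.
    Negative k picks a future (lead) value of x, positive k a past (lag) value."""
    x = np.asarray(x, dtype=float)
    lags = np.arange(-k0, k1 + 1)
    lag_probs = np.asarray(lag_probs, dtype=float)
    if lag_probs.shape != lags.shape or not np.isclose(lag_probs.sum(), 1.0):
        raise ValueError("lag_probs must have k0 + k1 + 1 entries summing to 1")
    if t - k1 < 0 or t + k0 >= x.size:
        raise ValueError("t is too close to the ends of the series")
    return beta_t * float(lag_probs @ x[t - lags])

# Toy check: x_t = t, lags in {-1, 0, 1}, most weight on lag 0.
x_toy = np.arange(10.0)
value = expected_response(x_toy, t=5, beta_t=2.0,
                          lag_probs=[0.25, 0.5, 0.25], k0=1, k1=1)
```

In the toy check the weighted average of $x_6$, $x_5$ and $x_4$ is 5, which is then scaled by the coefficient 2.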

3.3. The Ictal19 data revisited

Dynamic regression models assuming that $x_t$ is the actual signal recorded at the vertex channel, i.e. $x_t = y_{Cz,t}$, were fitted to the Ictal19 data. Relatively diffuse normal-inverse gamma priors were placed on the regression coefficients; normal priors conditional on $v_i$ with means $M_i = 1$ for all $i$ were used to model the regression coefficients. The transition probabilities $p(l_{i,t} \mid l_{i,t-1})$ were fixed for all $t$ and uniform initial priors $p(l_{i,0} = k \mid D_0) = 1/(k_1 + k_0 + 1)$ were considered. Discount factors in the range 0.99-0.999 were considered to control the variability of $\beta_{(i,t)}$ over time. Such values were chosen on the basis of exploration of marginal likelihood functions from univariate analyses of models that regress each recorded signal on the signal recorded at Cz and fixed lagged or led values of such signals. Posterior summaries were computed for models with $l_{i,t} \in \{-3, \ldots, 3\}$ and transition probabilities such that $p(l_{i,t} = k \mid l_{i,t-1} = k) > 0.999$ and where only movements of the type 'one step at a time' are permitted, i.e. $p(l_{i,t} = j \mid l_{i,t-1} = k) = 0$ for all $j$ such that $j > k + 2$ or $j < k - 2$. The choice of the set of values that the lags or leads can take is based on previous analyses of the cross-correlogram functions between each channel and Cz.
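A transition matrix of the general kind described above can be sketched as follows. Several choices here are assumptions made purely for illustration and are not taken from Table 1: 'one step at a time' is read as allowing moves only to adjacent lag values, the off-diagonal mass is split equally between the available neighbours, and the stay probability of 0.9995 is an arbitrary value above 0.999. The function name `lag_transition_matrix` is hypothetical.

```python
import numpy as np

def lag_transition_matrix(k0, k1, stay=0.9995):
    """Row-stochastic matrix P[a, b] = p(l_t = states[b] | l_{t-1} = states[a]).
    Assumptions: only adjacent moves are allowed and the mass 1 - stay is
    shared equally among the neighbours that exist (requires k0 + k1 >= 1)."""
    states = np.arange(-k0, k1 + 1)
    n = states.size
    P = np.zeros((n, n))
    for a in range(n):
        neighbours = [b for b in (a - 1, a + 1) if 0 <= b < n]
        P[a, a] = stay
        for b in neighbours:
            P[a, b] = (1.0 - stay) / len(neighbours)
    return states, P

states, P = lag_transition_matrix(k0=3, k1=3)
initial_prior = np.full(states.size, 1.0 / states.size)   # uniform: 1/(k1 + k0 + 1)
```

A stay probability this close to 1 makes changes in the lag rare, so that estimated lag trajectories change only occasionally over a series of a few thousand observations.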

The summary graphs displayed in Figs 4 and 5 are based on an analysis with such assumptions. The MCMC analysis produced 3000 draws from the posterior distributions for each channel, taken after a burn-in period of 4000 iterations. To explore MCMC convergence, the Geweke convergence diagnostic, the Heidelberger and Welch stationarity test and the Raftery and Lewis diagnostic (see Brooks and Roberts (1999)) were performed for some model parameters: $v_i$ for all $i$, $\beta_{(O1,1756)}$, $\beta_{(O1,3157)}$ and $l_{O2,877}$. Again, the Bayesian output analysis program version 0.4.3 for R was used. Autocorrelations at lags 1, 5, 10 and 50 for each of the sequences of the parameters were very low, always less than 0.07 in absolute value, and similar results were obtained for the cross-correlations of the parameters. All the sequences considered passed the Heidelberger and Welch convergence test, and neither the Geweke nor the Raftery and Lewis diagnostics gave evidence that convergence was not achieved for such sequences.
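The autocorrelation check on a sequence of MCMC draws can be sketched in a few lines of Python. This is not the Bayesian output analysis program used by the authors; the function name and the simulated chain below are assumptions for illustration, with independent normal draws standing in for a well-mixing sampler.

```python
import random


def autocorrelation(chain, lag):
    """Sample autocorrelation of a sequence of draws at a non-negative lag."""
    n = len(chain)
    mean = sum(chain) / n
    var = sum((c - mean) ** 2 for c in chain)
    cov = sum((chain[t] - mean) * (chain[t + lag] - mean) for t in range(n - lag))
    return cov / var


# Stand-in for 3000 retained posterior draws of one parameter (simulated here).
rng = random.Random(0)
draws = [rng.gauss(0.0, 1.0) for _ in range(3000)]
acf = {lag: autocorrelation(draws, lag) for lag in (1, 5, 10, 50)}
```

Values close to zero at all four lags, as reported above, indicate that successive retained draws behave almost like independent samples.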

The transition probabilities for the model are displayed in Table 1 and the system discount factors were set to $\delta_i = 0.996$. These transition probabilities impose desirable smoothness restrictions on the evolution of the lags and leads, but other structures may be considered.

Fig. 4 provides the estimated posterior means of the $\beta$-coefficients for the 19 channels at selected time points. The values are displayed at approximate electrode locations; dark intensities match higher coefficient values whereas light intensities correspond to lower values. As expected, in general a given channel shares more similarities with channels located closer to it, which is consistent with the behaviour of the estimates in the factor model analyses (see Fig. 3). Note the difference, more evident towards the end of the course of the seizure, between the estimated coefficients of channels located at the frontal right-hand side and at the frontal left-hand side: the values of the coefficients are much smaller for the frontal channels on the right-hand side than for those on the left-hand side.

Table 1. Transition probabilities $p(l_{i,t} = k \mid l_{i,t-1} = m)$

m      k = -2     k = -1     k = 0      k = 1      k = 2
-2     0.9999     0.0001     0.0        0.0        0.0
-1     0.00005    0.9999     0.00005    0.0        0.0
0      0.0        0.00005    0.9999     0.00005    0.0
1      0.0        0.0        0.00005    0.9999     0.00005
2      0.0        0.0        0.0        0.0001     0.9999

Fig. 4. Estimated posterior means of the dynamic factor weights for all channels at selected points during the course of the seizure (panels at t = 26.7, t = 52.3 and t = 73.3)

Fig. 5. Dynamic lags and leads on the Ictal19 data set, based on posterior mean estimates (panels at t = 13.95, t = 27.9 and t = 53.5; intensity scale from -2 to 2)

Fig. 5 displays the estimated lags and leads based on posterior means of the $l_{i,t}$ quantities at different time points. If a site displays the lightest intensity at a given $t$, then the signal recorded at this site is delayed by 2 units of time with respect to the signal recorded at Cz. Signals recorded at occipital sites are delayed by 2 units of time with respect to the signal recorded at Cz during the early-middle and middle seizure portions (see the graphs for t = 13.95 and t = 27.91). Similarly, if a site shows the darkest intensity at a given $t$, the signal recorded at this site leads the signal recorded at Cz by 2 units of time. Central portions of the seizure display intense lag-lead activity characterized by lags in the occipital regions and leads in the frontal and prefrontal regions. Again, an image plot is superimposed to provide visual interpolation.

In addition, Fig. 6 illustrates how the model captures the time-varying lag-lead structure in the series. It displays the EEG traces for channel Cz (full curves), together with the estimated traces for channel O1 (dotted curves), during the early-middle and middle seizure periods. Clearly, channel O1 is delayed with respect to channel Cz during some seizure portions, and the delay is larger during the early-middle portions (compare Figs 6(a) and 6(b)). Samples of the standardized residuals were computed from the MCMC sequences and explored using standard residual diagnostics, graphics and correlation analyses. The sampled residuals generally exhibit no obvious residual structure, and cross-correlograms of the residuals displayed no apparent phase delays. However, for some of the channels, in particular for channels Fp1 and Fp2, the residual autocorrelations tend to be more appreciable towards the end of the series. Uncertainty increases towards the dissipation of the seizure and these channels are 'remote' with respect to the vertex. We believe that this does not invalidate our modelling approach as the main interest here is in studying time intervals corresponding to major seizure activity. However, refinements of the model that capture this moderate 'tail' correlation may be considered in the future, particularly if the focus is on studying the seizure dissipation period. One possibility is to extend the models here to regress a given signal on several processes $x_{1,t}, \ldots, x_{k,t}$; for example, in addition to regressing any given channel on channel Cz we may also consider other 'neighbouring' channels.

Fig. 6. EEG traces for channel Cz (full curves) and estimated traces obtained from the dynamic regression model for channel O1 (dotted curves) at (a) early-middle and (b) middle seizure periods; the horizontal axes show time in seconds
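The residual check for phase delays mentioned above relies on cross-correlograms. A minimal Python sketch of the idea follows; the function names and the simulated series are assumptions for illustration and do not reproduce the authors' diagnostics.

```python
import random


def cross_correlation(a, b, lag):
    """Sample correlation between a[t] and b[t + lag]; lag may be negative."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    scale_a = sum((v - mean_a) ** 2 for v in a) ** 0.5
    scale_b = sum((v - mean_b) ** 2 for v in b) ** 0.5
    pairs = [(a[t], b[t + lag]) for t in range(n) if 0 <= t + lag < n]
    return sum((u - mean_a) * (w - mean_b) for u, w in pairs) / (scale_a * scale_b)


def peak_lag(a, b, max_lag):
    """Lag in -max_lag..max_lag at which |cross-correlation| is largest."""
    return max(range(-max_lag, max_lag + 1),
               key=lambda k: abs(cross_correlation(a, b, k)))


# Simulated example: series b repeats series a two time units later.
rng = random.Random(1)
a = [rng.gauss(0.0, 1.0) for _ in range(300)]
b = [0.0, 0.0] + a[:-2]
detected = peak_lag(a, b, 5)
```

A pronounced peak away from lag 0 would signal a remaining phase delay; residuals with no apparent phase delays show no such peak.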

The posterior summaries presented confirm the existence of a spatiotemporal structure linking the EEG channels. Channels located close to one another exhibit similar time-varying signal features over the course of the seizure. Smoothness conditions were imposed to describe the changes of model parameters over time, and even under such conditions it is possible to discover strong spatiotemporal patterns across the multiple signals.

4. Discussion

The analyses presented here demonstrate how these models may be used to explore the complex spatiotemporal structure underlying the Ictal19 EEG series. Previous univariate TVAR decompositions had suggested a common underlying structure characterized primarily by a persistent 'seizure waveform' whose characteristic frequency takes values of 4-5 Hz at the beginning of the seizure and decays to much smaller values towards the end. Single-factor models confirm this structure, revealing also a spatial relationship across the 19 EEG traces. The need for models including more than two factors is highlighted by the structure of the residuals. However, such developments are not easy, as both the computational complexity of factor models and the difficulty of interpreting them increase considerably with the number of factors.

Our dynamic regression models have proven useful in discovering the latent structure underlying the 19 EEG series. It is clear from the estimated lag-lead structure that an EEG signal recorded at a given site may exhibit time-varying delays with respect to another signal recorded at a different scalp location. Similarly, the values of the estimated regression coefficients are strongly related to the scalp location and the time interval considered. Models with more than one latent process and alternative structures for the class of transition probability models may be considered. Such developments are certainly of technical interest, though we are sceptical whether the additional modelling complexities will be at all worthwhile from the viewpoint of refining the interpretation and understanding of EEG traces.

The analyses presented here suggest that dynamic regression models can usefully assist researchers in improving the analysis of multichannel EEG data in general. Our models improve our ability to study interlead relationships dynamically over the course of an EEG recording compared with existing techniques. As an example, the application of dynamic regressions to a multichannel EEG ECT seizure recording allowed us to carry out the first continuous analysis of spatial interrelationships between the channels over time, and to identify the regional expression of seizure activity, the degree of spatial interrelatedness of EEG activity and the interlead lead-lag structure over the course of the seizure. In agreement with Krystal, West, Prado, Greenside, Zoldi and Weiner (1999), we found periods of consistent lead-lag structure in some seizures, suggestive of physiological travelling waves of electrical activity in the neurons of the cerebral cortex. Additionally, we can now demonstrate how this structure varies with time. This information has important implications for the physiological mechanisms of generalized tonic-clonic seizures in terms of their manner of initiation, development and propagation.

Further directions for research within the multivariate modelling framework are under consideration. The TVAR decomposition results have a direct extension to the multivariate case (Prado, 1998; Aguilar et al., 1999). This is connected with the work of W. Gersch and co-workers (e.g. Gersch (1987) and Kitagawa and Gersch (1996)), although they have focused on studying multichannel EEG data via exploration of the spectral densities rather than on time domain features. Such developments and connections are currently under study.

Appendix A

A.1. Posterior sampling algorithms for factor models

Consider a model with $k$ factors, $m$ series and the following structure for $t = 1, \ldots, n$:

$$
\begin{aligned}
y_{i,t} &= \beta_{(i,1)} x_{1,t} + \cdots + \beta_{(i,k)} x_{k,t} + \nu_{i,t}, & \nu_{i,t} &\sim N(\cdot \mid 0, v), \\
x_{j,t} &= \phi_{(j,t,1)} x_{j,t-1} + \cdots + \phi_{(j,t,p_j)} x_{j,t-p_j} + \eta_{j,t}, & \eta_{j,t} &\sim N(\cdot \mid 0, s_j), \\
\phi_{j,t} &= \phi_{j,t-1} + \omega_{j,t}, & \omega_{j,t} &\sim N(\cdot \mid 0, U_{j,t}),
\end{aligned}
$$

with $\phi_{j,t}$ the vector of $p_j$ TVAR coefficients related to the factor $x_{j,t}$; the $U_{j,t}$ are $p_j \times p_j$ matrices and $\nu_{i,t}$, $\eta_{j,t}$ and $\omega_{j,t}$ are independent and mutually independent innovations. We assume that the matrices $U_{j,t}$ are known or specified by single discount factors and that $r_j = s_j/v$ are known quantities for all $j$. Let $Y = (y_1, \ldots, y_n)$ with $y_t' = (y_{1,t}, \ldots, y_{m,t})$, $X = (X_{1,n}, \ldots, X_{k,n})$ with $X_{j,n} = (x_{j,1}, \ldots, x_{j,n})$, $U = (U_1, \ldots, U_k)$ with $U_j = (U_{j,1}, \ldots, U_{j,n})$, $B = (\beta_1, \ldots, \beta_k)$ with $\beta_j = (\beta_{(1,j)}, \ldots, \beta_{(m,j)})'$ and $\Phi = (\phi_1, \ldots, \phi_k)$ with $\phi_j = (\phi_{j,1}, \ldots, \phi_{j,n})$. Priors are specified on $v$, on $\beta_j$, and on the initial TVAR coefficients and the initial values of each factor process, for $j = 1, \ldots, k$. The Gibbs sampling scheme iterates through the set of complete conditional posteriors given below. Details of the sampling steps are given in Prado (1998).

A.1.1. Step 1

Sample $X$ from $p(X \mid Y, B, \Phi, v, s_1, \ldots, s_k)$. This reduces to sampling $X_{j,n}$ from $p\{X_{j,n} \mid Y, B, \Phi, v, s_1, \ldots, s_k, X(-X_{j,n})\}$, with $X(-X_{j,n})$ the full set of factor process values except $X_{j,n}$. For each $j$, it is possible to write a standard dynamic linear model with parameters $X_{j,n}$. Therefore, efficient generation can be performed via forward filter-backward sampling algorithms (Carter and Kohn, 1994; Frühwirth-Schnatter, 1994).

A.1.2. Step 2

Sample $\phi_j$ from $p(\phi_j \mid X_{j,n}, s_j, U_j)$ for $j = 1, \ldots, k$. Again, for each $j$ we can write a dynamic linear model with model parameters $\phi_j$; therefore, efficient generation can be performed via forward filter-backward sampling algorithms.

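A.1.3. Step 3

Sample $v$ from $p(v \mid Y, B, X)$. This reduces to sampling from the posterior proportional to $p(v)\, v^{-(\alpha+1)} \exp(-\beta/v)$ for some $\alpha > 0$ and $\beta > 0$ (here $\beta$ is a scalar constant, not a factor weight). Inverse gamma priors on $v$ are conjugate.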

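A.1.4. Step 4

Sample $B$ from $p(B \mid Y, X, v)$. This simplifies to sampling, for each $j$, from the posterior proportional to

$$
p(\beta_j) \exp\left( -\sum_{t=1}^{n} u_{j,t}' u_{j,t} / 2v \right)
$$

with

$$
u_{j,t} = y_t - \sum_{l=1}^{j-1} \beta_l x_{l,t} - \beta_j x_{j,t} - \sum_{l=j+1}^{k} \beta_l x_{l,t},
$$

so that $u_{j,t}$ is the residual vector at time $t$, viewed as a function of $\beta_j$ with the other columns of $B$ held fixed.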

The restrictions on the factor weights should be considered when sampling from the posteriors.

A.2. Posterior sampling algorithms for dynamic regression models with time-varying lag-lead structures

In model (3), define $y_i = (y_{i,1}, \ldots, y_{i,n})$, $x = (x_1, \ldots, x_n)$, $\beta_i = (\beta_{(i,1)}, \ldots, \beta_{(i,n)})$, $l_i = (l_{i,1}, \ldots, l_{i,n})$ and $u_i = (u_{i,1}, \ldots, u_{i,n})$. Let $p(l_{i,0})$, $p(\beta_{(i,0)})$ and $p(v_i)$ denote the priors. The Gibbs sampling scheme to generate posterior samples iterates through the set of complete conditional posteriors given as follows.

A.2.1. Step 1

Sample $\beta_i$ from $p(\beta_i \mid y_i, x, l_i, v_i)$. Conditionally on $y_i$, $x$, $l_i$ and $v_i$ we have a dynamic linear model whose system parameter is $\beta_i$. Efficient generation can be obtained via forward filter-backward sampling.
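A minimal Python sketch of forward filter-backward sampling for a scalar coefficient is given below. It assumes an observation equation of the form $y_t = \beta_t f_t + \nu_t$ with $\nu_t \sim N(0, v)$, where $f_t$ stands for the suitably shifted reference signal, and a random-walk evolution for $\beta_t$ whose variance is set by a single discount factor. The function name, argument names and toy data are illustrative assumptions, not the authors' code.

```python
import random


def ffbs_regression_coefficients(y, f, v, delta, m0, c0, rng):
    """Draw (beta_1, ..., beta_n) from its joint conditional posterior.

    y, f   : observations and regressors (f[t] plays the role of x at t - l_t)
    v      : observation variance, treated as known in this Gibbs step
    delta  : discount factor in (0, 1] controlling the evolution variance
    m0, c0 : prior mean and variance of the coefficient at time 0
    """
    n = len(y)
    m, c, a, r = [0.0] * n, [0.0] * n, [0.0] * n, [0.0] * n
    m_prev, c_prev = m0, c0
    for t in range(n):                      # forward filtering
        a[t] = m_prev                       # prior mean at t
        r[t] = c_prev / delta               # prior variance inflated by discounting
        q = f[t] * f[t] * r[t] + v          # one-step forecast variance
        gain = r[t] * f[t] / q
        m[t] = a[t] + gain * (y[t] - f[t] * a[t])
        c[t] = r[t] * v / q                 # equals r - gain^2 * q for a scalar state
        m_prev, c_prev = m[t], c[t]
    beta = [0.0] * n
    beta[n - 1] = rng.gauss(m[n - 1], c[n - 1] ** 0.5)
    for t in range(n - 2, -1, -1):          # backward sampling
        b = c[t] / r[t + 1]
        mean = m[t] + b * (beta[t + 1] - a[t + 1])
        var = c[t] - b * b * r[t + 1]
        beta[t] = rng.gauss(mean, max(var, 0.0) ** 0.5)
    return beta


# Toy data (hypothetical): the response is exactly twice the regressor.
f = [1.0] * 50
y = [2.0] * 50
beta_draw = ffbs_regression_coefficients(y, f, 1e-6, 0.99, 1.0, 10.0, random.Random(0))
```

In a full Gibbs sampler this step would be repeated at every iteration with `f` rebuilt from the current draw of the lags.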

A.2.2. Step 2

Sample $l_i$ from $p(l_i \mid y_i, x, \beta_i, v_i)$. Following Carter and Kohn (1994), we sample $l_{i,n}$ from $p(l_{i,n} \mid y_i, x, \beta_i, v_i)$ and then, for $t = n-1, \ldots, 1$, we sample from $p(l_{i,t} \mid l_{i,t+1}, y_i^t, x^t, \beta_i, v_i)$. A discrete filter is used to compute $p(l_{i,n} \mid y_i, x, \beta_i, v_i)$ and $p(l_{i,t} \mid y_i^t, x^t, \beta_i, v_i)$.
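The discrete filter followed by backward sampling can be sketched in Python as below. The sketch assumes a Gaussian likelihood $N(y_t \mid \beta_t x_{t-k}, v)$ for each candidate lag $k$ and a padded reference signal so that every shifted index exists; the function name, the padding convention and the toy data are assumptions for illustration.

```python
import math
import random


def sample_lags(y, x, beta, v, P, prior, k0, k1, rng):
    """Draw (l_1, ..., l_n) by discrete forward filtering, backward sampling.

    x is padded: x[j] holds the reference signal at time j - k1, so that
    x[t - k + k1] is its value at time t - k for every lag k in -k0..k1.
    P[a][b] is the probability of moving from state a to state b.
    """
    states = list(range(-k0, k1 + 1))
    n, s = len(y), len(states)
    filtered, prev = [], prior
    for t in range(n):                                   # forward filter
        pred = [sum(prev[a] * P[a][b] for a in range(s)) for b in range(s)]
        loglik = [-0.5 * (y[t] - beta[t] * x[t - k + k1]) ** 2 / v for k in states]
        top = max(loglik)                                # guard against underflow
        post = [p * math.exp(ll - top) for p, ll in zip(pred, loglik)]
        total = sum(post)
        prev = [p / total for p in post]
        filtered.append(prev)
    lags = [0] * n
    idx = rng.choices(range(s), weights=filtered[n - 1])[0]
    lags[n - 1] = states[idx]
    for t in range(n - 2, -1, -1):                       # backward sampling
        weights = [filtered[t][a] * P[a][idx] for a in range(s)]
        idx = rng.choices(range(s), weights=weights)[0]
        lags[t] = states[idx]
    return lags


# Toy data (hypothetical): y equals the reference signal one time unit earlier.
k0 = k1 = 2
n = 20
x = [float(j * j) for j in range(n + k0 + k1)]
beta = [1.0] * n
y = [x[t - 1 + k1] for t in range(n)]
P = [[0.90, 0.10, 0.00, 0.00, 0.00],
     [0.05, 0.90, 0.05, 0.00, 0.00],
     [0.00, 0.05, 0.90, 0.05, 0.00],
     [0.00, 0.00, 0.05, 0.90, 0.05],
     [0.00, 0.00, 0.00, 0.10, 0.90]]
lag_draw = sample_lags(y, x, beta, 0.01, P, [0.2] * 5, k0, k1, random.Random(1))
```

Because the toy response is an exact one-unit delay of the reference signal and the noise variance is small, the sampled path settles on a lag of 1 at every time point.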

A.2.3. Step 3

Sample $v_i$ from $p(v_i \mid y_i, x, \beta_i, l_i)$. This reduces to sampling from the conditional posterior distribution proportional to

$$
p(v_i)\, v_i^{-(\alpha+1)} \exp(-\beta/v_i)
$$

for some $\alpha, \beta > 0$. Inverse gamma priors on $v_i$ are conjugate.
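The conjugate update for the observation variance can be sketched in Python as follows. The sketch assumes an inverse gamma prior with shape `prior_shape` and rate `prior_rate`, and that only the observation residuals $y_{i,t} - \beta_{(i,t)} x_{t-l_{i,t}}$ enter the conditional; these names and that simplification are assumptions for illustration.

```python
import random


def sample_observation_variance(residuals, prior_shape, prior_rate, rng):
    """Draw v from its inverse gamma conditional given Gaussian residuals.

    Combining the prior kernel v^-(a0 + 1) exp(-b0 / v) with the likelihood
    kernel v^-(n / 2) exp(-SS / (2 v)) gives shape a0 + n / 2 and rate
    b0 + SS / 2, where SS is the residual sum of squares.
    """
    n = len(residuals)
    shape = prior_shape + 0.5 * n
    rate = prior_rate + 0.5 * sum(e * e for e in residuals)
    # If G ~ Gamma(shape, rate) then 1 / G is inverse gamma(shape, rate).
    # random.gammavariate expects a scale parameter, hence 1 / rate.
    draw = 1.0 / rng.gammavariate(shape, 1.0 / rate)
    return draw, shape, rate


# Toy residuals (hypothetical) with unit mean square.
rng = random.Random(0)
residuals = [1.0, -1.0] * 500
results = [sample_observation_variance(residuals, 2.0, 1.0, rng) for _ in range(2000)]
variance_draws = [r[0] for r in results]
```

With 1000 residuals of unit mean square the draws concentrate near 1, as expected from the posterior mean rate / (shape - 1).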

References

Aguilar, O., Huerta, G., Prado, R. and West, M. (1999) Bayesian inference on latent structure in time series (with discussion). In Bayesian Statistics 6 (eds J. O. Berger, J. M. Bernardo, A. P. Dawid and A. F. M. Smith). Oxford: Oxford University Press.
Angeleri, F., Butler, S., Giaquinto, S. and Majkowski, J. (1997) Analysis of the Electrical Activity of the Brain. Chichester: Wiley.
Brooks, S. P. and Roberts, G. O. (1999) Assessing convergence of Markov chain Monte Carlo algorithms. Statist. Comput., 8, 319-335.
Carter, C. K. and Kohn, R. (1994) Gibbs sampling for state space models. Biometrika, 81, 541-553.
Dyro, F. M. (1989) The EEG Handbook. Boston: Little, Brown.
Frühwirth-Schnatter, S. (1994) Data augmentation and dynamic linear models. J. Time Ser. Anal., 15, 183-202.
Gersch, W. (1985) Modeling nonstationary time series and inferring instantaneous dependency, feedback and causality; an application to human epileptic seizure event data. In Identification and System Parameter Estimation. Proc. 7th IFAC-IFORS Symp., York, pp. 737-742.
Gersch, W. (1987) Non-stationary multichannel time series analysis. In EEG Handbook, Revised Series (ed. A. Gevins), vol. 1, pp. 261-296. New York: Academic Press.
Huerta, G. and West, M. (1999) Priors and component structures in autoregressive time series models. J. R. Statist. Soc. B, 61, 881-899.
Kitagawa, G. and Gersch, W. (1996) Smoothness priors analysis of time series. Lect. Notes Statist., 116.
Krystal, A. D., Greenside, H. S., Weiner, R. D. and Gassert, D. (1996) A comparison of EEG signal dynamics in waking, after anesthesia induction and during electroconvulsive therapy seizures. Electroenceph. Clin. Neurophysiol., 99, 129-140.
Krystal, A. D., Prado, R. and West, M. (1999) New methods of time series analysis of non-stationary EEG data: eigenstructure decomposition of time varying autoregressions. Clin. Neurophysiol., 110, 1-10.
Krystal, A. D., West, M., Prado, R., Greenside, H. S., Zoldi, S. and Weiner, R. D. (1999) The EEG effects of ECT: implications for rTMS. Depressn Anx., to be published.
Niedermeyer, E. (1993) Epileptic seizure disorders. In Electroencephalography (eds E. Niedermeyer and F. Lopes da Silva), 3rd edn, pp. 461-564. Baltimore: Williams and Wilkins.
Peña, D. and Box, G. E. P. (1987) Identifying a simplifying structure in time series. J. Am. Statist. Ass., 82, 836-843.
Prado, R. (1998) Latent structure in non-stationary time series. PhD Thesis. Duke University, Durham.
Prado, R. and West, M. (1997) Exploratory modelling of multiple non-stationary time series: latent process structure and decompositions. In Modelling Longitudinal and Spatially Correlated Data (ed. T. Gregoire). New York: Springer.
Staton, R. D., Prudic, J., Devanand, D. P. and Prudic, J. (1981) Stimulus intensity, seizure threshold and seizure duration: impact on the efficacy and cognitive effects of electroconvulsive therapy. New Engl. J. Med., 328, 803-843.
Tiao, G. C. and Tsay, R. S. (1989) Model specification in multivariate time series (with discussion). J. R. Statist. Soc. B, 51, 157-213.
Troughton, P. T. and Godsill, S. J. (1997) Bayesian model selection for time series using Markov Chain Monte Carlo. Technical Report. Signal Processing and Communications Laboratory, Department of Engineering, University of Cambridge, Cambridge.
Weiner, R. D., Coffey, E. and Krystal, A. D. (1991) The monitoring and management of electrically induced seizures. Psychiatr. Clin. N. Am., 14, 845-869.
Weiner, R. D. and Krystal, A. D. (1993) EEG monitoring of ECT seizures. In The Clinical Science of Electroconvulsive Therapy (ed. C. E. Coffey), pp. 93-109. Washington DC: American Psychiatric Press.
Weiner, R. D. and Krystal, A. D. (1994) The present use of electroconvulsive therapy. A. Rev. Med., 45, 273-281.
West, M. and Harrison, J. (1997) Bayesian Forecasting and Dynamic Models, 2nd edn. New York: Springer.
West, M., Prado, R. and Krystal, A. D. (1999) Evaluation and comparison of EEG traces: latent structure in nonstationary time series. J. Am. Statist. Ass., 94, 375-387.
Zoldi, S., Krystal, A. D. and Greenside, H. S. (2000) Stationarity and redundancy of multichannel EEG data recorded during generalized tonic-clonic seizures. Brain Topogr., 12, 187-200.