
  • 8/7/2019 Nikhil Chandaria - Final Year Project


    A diagnostic tool analysing ablative heart surgery

    Nikhil Chandaria

    CID:00466532

    Supervisor: Dr. Colin J. Cotter

    1st of June, 2010

    Abstract

This is a report on creating a filter to classify an electrocardiogram. The Ornstein-Uhlenbeck process is used to model the signal between heartbeats, and we investigate the use of the Ensemble Kalman Filter to estimate the parameters of this stochastic process. We find that the filter is unable to estimate the drift term in the stochastic differential equation. We then propose a modification involving the Ensemble Square Root Filter and a Bayesian approach to estimating this parameter, which proves to be more successful; however, we discover that the Ornstein-Uhlenbeck process is an insufficient model for the signal between each heartbeat from a patient.


    Acknowledgements

I would like to thank Dr. Colin Cotter for his encouragement, advice and insight throughout this project. I would also like to thank Professor Nicholas Peters and Dr. Louisa Lawes for their explanation of atrial fibrillation and the ablation procedure, and for providing the electrocardiogram data.


Contents

1 Introduction
  1.1 Atrial Fibrillation
  1.2 Ablative Surgery

2 Stochastic Processes
  2.1 Euler-Maruyama Method
  2.2 Ornstein-Uhlenbeck Process
    2.2.1 Applications
    2.2.2 Validity

3 Kalman Filters
  3.1 Linear Kalman Filter
    3.1.1 Algorithm
    3.1.2 Application
  3.2 Extended Kalman Filter
    3.2.1 Algorithm
    3.2.2 Application

4 Ensemble Filters
  4.1 Particle Filter
  4.2 Ensemble Kalman Filter
    4.2.1 Algorithm
    4.2.2 Application
  4.3 Ensemble Square Root Filter
    4.3.1 Algorithm
    4.3.2 Application
  4.4 Estimating θ
    4.4.1 Algorithm
    4.4.2 Application
    4.4.3 Changes in Parameters
    4.4.4 Robustness

5 Application to Heart Surgery Data

6 Final Remarks
  6.1 Limitations
  6.2 Recommendations for Future Work

Bibliography

A Solution to the OU Process

B Estimation Issue

C Maximum Likelihood Estimator

D Ornstein-Uhlenbeck Process MATLAB Code

E Final Filter MATLAB Code

F Removing Heartbeats from ECG Data


    1 Introduction

Atrial fibrillation (AF) is classified as a cardiac arrhythmia: an irregular heartbeat. It is associated with problems within the electrical conduction system of the heart. Within the UK there are at least 46,000 people diagnosed every year (Iqbal et al., 2005); the subsequent result is that £459 million is spent by the National Health Service (Stewart et al., 2004), which is roughly 1% of the NHS budget.

There are a variety of treatments available for patients who suffer from AF, including medicinal, electrical and chemical cardioversion, and ablative surgery. Within this project we are going to focus on catheter ablation. We will consider the detection of AF and the ablative procedure. In section 2 we will consider stochastic processes as a means for modelling the heart in order to develop an analytical tool. In sections 3 and 4 we will consider filtering techniques for estimation purposes. In section 5 we will examine the application of filtering techniques to data from the electrocardiogram (ECG).

    1.1 Atrial Fibrillation

    In this subsection we will examine atrial fibrillation and methods of detecting the signal.

We will first consider normal sinus rhythm (a regular heartbeat) and how the electrical signal is conducted through the heart. It is possible to classify the heartbeat into different stages with the use of an ECG. A typical heartbeat is shown in figure 1.1.2. We can then identify the electrical impulse that is generated in the sinoatrial node (shown as the sinus node in figure 1.1.1) as the P wave; this is what causes the contraction of the atria. This will push the blood from the atria into the ventricles. The electrical signal will then travel to the atrioventricular (AV) node, upon which the signal will cause the ventricles to contract, thus forcing blood from the heart to the rest of the body. The delay between the contractions of the atria and the ventricles is characterized by the PR segment; without this delay the entire heart would beat at the same time. The QRS complex denotes the spread of the electrical activity from the atria to the ventricles. Finally, the repolarization of the ventricles is shown by the presence of the T wave.

    Fig. 1.1.1: Diagram illustrating the main areas of the heart

For a patient who suffers from AF there are two main methods for detection of the condition: the regularity of the heartbeat or, the stronger of the two indicators, the absence of P waves (American Heart Association, 2008). The condition can cause palpitations, fainting and congestive heart failure.



    Fig. 1.1.2: A typical heartbeat as shown on an ECG

    1.2 Ablative Surgery

    In this subsection we will discuss the ablative surgery method and the issues associated with it.

Ablative surgery is a procedure where surgeons insert a number of electrodes into a patient's heart and measure the electrical activity (Sivakumaren, 2009). A catheter using a high-frequency alternating current is then used to burn away any abscesses that may exist. This is done by a surgeon searching for any abnormal impulses using a roving electrode. Once the surgeon has located and ablated the abscess they will then continue to search for any other sources of impulses that may exist. The aim is that this can aid in returning the heartbeat to a normal sinus rhythm.

This method, however, is subjective and can lead to varying success rates between surgeons and treatment centers (Calkins et al., 2007). We believe that this is because there is no method for determining whether a signal is displaying disorganized electrical activity, as is the case under AF. In this project we aim to work on a method that will classify whether a signal is noise or whether it is indeed atrial fibrillation. We hope that this research will help provide an objective decision-making process for surgeons in determining whether a signal is noise or an abnormal electrical activity causing AF. This may provide a method for a surgeon to perform a post-operation analysis on the procedure, to understand the impact of ablation on the patient and to be able to distinguish between noisy signals and anomalous electrical activity in the heart.


    2 Stochastic Processes

A stochastic process is a random process. The most common example is that of Brownian motion. A stochastic process can only be described by a probability density function (pdf). In the case of an ordinary differential equation (ODE) a given initial condition will give one real evolution through time; however, for a stochastic differential equation (SDE) with a given initial condition there can be any number of possible paths (Risken, 1996). An example of a stochastic process is a Wiener process, which has the following

properties:

W_0 = 0
W_t - W_s ~ N(0, t - s) for 0 <= s < t

We can then generate a Wiener process after a set time span using the following equation:

x_t = x_{t-1} + N(0, Δt)   (2.0.1)

with the initial condition to this equation being x_0 = 0.

Fig. 2.0.1: Demonstration of 4 sample Wiener process paths. This shows the randomness of generating a Wiener process.

Figure 2.0.1 displays an example of 4 different Wiener processes generated using a time difference of 0.01 and 100 time steps.

Figure 2.0.2 contains the evolution of 10,000 Wiener processes using the same conditions used to generate the paths shown in figure 2.0.1. Based on these plots and (Risken, 1996) we can infer that the evolution of the standard deviation of the particles is given by:

σ(t) = √t   (2.0.2)


Fig. 2.0.2: Demonstration of 10,000 Wiener processes. This demonstrates that the process displays a standard deviation of √t.
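Both equation 2.0.1 and the √t spread can be checked numerically. The following is a minimal NumPy sketch (illustrative Python rather than the MATLAB used in the appendices; the function name and parameters are my own):

```python
import numpy as np

def wiener_paths(n_paths, n_steps, dt, seed=0):
    """Sample Wiener paths via x_t = x_{t-1} + N(0, dt) with x_0 = 0 (eq. 2.0.1)."""
    rng = np.random.default_rng(seed)
    # independent Gaussian increments with variance dt
    steps = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    # prepend x_0 = 0 and accumulate the increments
    return np.hstack([np.zeros((n_paths, 1)), np.cumsum(steps, axis=1)])

paths = wiener_paths(n_paths=10_000, n_steps=100, dt=0.01)
# the spread at time t grows like sqrt(t); at t = 1 s it should be close to 1
print(paths[:, -1].std())
```

With 10,000 paths the sample standard deviation at t = 1 s lands within a few percent of 1, matching equation 2.0.2.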

    2.1 Euler-Maruyama Method

    In this subsection we will discuss the Euler-Maruyama method for solving SDEs numerically.

For many SDEs of the form

dX_t = a(X_t) dt + b(X_t) dW_t   (2.1.1)

there is no explicit solution that can be obtained (Risken, 1996); thus a numerical integrator is required to obtain an approximate solution. The simplest numerical method is the Euler-Maruyama (EM) method, which approximates the solution of an SDE using the following equation:

X_{k+1} = X_k + a(X_k) Δt + b(X_k) ΔW_k   (2.1.2)

where ΔW_k = W(t_k) - W(t_{k-1}). This method has a strong order of convergence of n = 1/2 and a weak order of convergence of n = 1 (Kloeden et al., 2003). This means that if we want to improve the precision of the EM method by 10 times we would need to reduce the time step by 100 times. We can see that the above equation reduces to the forward Euler method for ordinary differential equations when b = 0.
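Equation 2.1.2 can be sketched for a scalar SDE as follows (illustrative Python; the function and parameter names are my own, and the drift/diffusion shown are the OU terms of section 2.2 with assumed values θ = 100, μ = 0, σ = 1):

```python
import numpy as np

def euler_maruyama(a, b, x0, dt, n_steps, seed=0):
    """Integrate dX = a(X) dt + b(X) dW via X_{k+1} = X_k + a(X_k) dt + b(X_k) dW_k."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        # dW_k = W(t_{k+1}) - W(t_k) ~ N(0, dt)
        dW = rng.normal(0.0, np.sqrt(dt))
        x[k + 1] = x[k] + a(x[k]) * dt + b(x[k]) * dW
    return x

# OU drift theta*(mu - x) and constant diffusion sigma
path = euler_maruyama(a=lambda x: 100.0 * (0.0 - x), b=lambda x: 1.0,
                      x0=1.0, dt=0.001, n_steps=1000)
```

Note that the explicit scheme is only stable for this drift when θΔt is small, which is why a time step of 0.001 is used here.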

    2.2 Ornstein-Uhlenbeck Process

This subsection deals with the mean-reverting process called the Ornstein-Uhlenbeck process, which will be used as a method for modelling the signal between each heartbeat. The application and validity of the method will also be discussed.

The Ornstein-Uhlenbeck (OU) process is a mean-reverting stochastic process (Uhlenbeck and Ornstein, 1940) of the form:

dX_t = θ(μ - X_t) dt + σ dW_t   (2.2.1)


where θ, σ > 0 and W_t is a Wiener process. The exact solution of the OU process has the following properties:

E[X_t] = X_0 e^(-θt) + μ(1 - e^(-θt))   (2.2.2)

and

var(X_t) = (σ² / 2θ)(1 - e^(-2θt))   (2.2.3)

For proof of these two statements please refer to appendix A. In order to generate a sample path simulation of equation 2.2.1 we need to apply an AR(1) method:

x_t = x_{t-1} e^(-θΔt) + μ(1 - e^(-θΔt)) + σ √((1 - e^(-2θΔt)) / 2θ) N(0, 1)   (2.2.4)

An interesting note is that the OU process satisfies the Markov property:

P(X_{n+1} | X_1, X_2, …, X_n) = P(X_{n+1} | X_n)   (2.2.5)

Applying the parameters laid out in table 2.2.1 results in figure 2.2.1.

θ                 100
μ                 0
σ                 1
Δt                0.01
Number of points  100
Total time        1 s

Table 2.2.1: Parameters for the sample OU process.

Fig. 2.2.1: A sample Ornstein-Uhlenbeck process generated using μ = 0, θ = 100 and σ = 1, with 100 points and a total time for the process of 1 second. This demonstrates the mean-reverting nature of the process.
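The AR(1) update of equation 2.2.4 can be sketched as follows (illustrative Python rather than the appendix MATLAB; the function name and defaults are my own):

```python
import numpy as np

def ou_path(theta, mu, sigma, dt, n_steps, x0=0.0, seed=0):
    """Sample an OU path with the exact AR(1) update of equation 2.2.4."""
    rng = np.random.default_rng(seed)
    decay = np.exp(-theta * dt)
    # noise scale consistent with the stationary variance sigma^2 / (2 theta)
    noise_sd = sigma * np.sqrt((1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta))
    x = np.empty(n_steps + 1)
    x[0] = x0
    for t in range(n_steps):
        # mean-reverting pull towards mu plus exact-discretisation noise
        x[t + 1] = x[t] * decay + mu * (1.0 - decay) + noise_sd * rng.normal()
    return x

# parameters of table 2.2.1: theta = 100, mu = 0, sigma = 1, dt = 0.01, 100 points
x = ou_path(theta=100.0, mu=0.0, sigma=1.0, dt=0.01, n_steps=100)
```

Because the update uses the exact transition density, the path is statistically correct for any Δt, unlike the Euler-Maruyama approximation.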

    2.2.1 Applications

There are various applications for the Ornstein-Uhlenbeck process, such as financial modelling. A popular use of the process is for commodity pricing, with the parameters representing an equilibrium price, the risk of the product and the sensitivity to external factors. It has also been used in modelling the cardiovascular system in animals (Gartner et al., 2010).


    2.2.2 Validity

The importance of examining the Ornstein-Uhlenbeck process is to assess whether it will be a useful model for devising a tool that can process data from an electrocardiogram (ECG). The main requirement is that the process can match the data taken from an ECG. The idea is that the signal between heartbeats is mean reverting and thus we can use the OU process to generate a signal that will be analogous to the signal between heartbeats. In order to do this we need to consider the data from a surgery; as we discussed in section 1.1, the absence of P waves on an ECG is a strong indicator of the presence of atrial fibrillation, thus it is important to examine the signal between the heartbeats to detect the presence of the P wave. To do this it is important that we filter out the heartbeat to understand whether the stochastic process chosen is suitable.

Fig. 2.2.2: The bottom graph shows an unmodified ECG. The top graph is the same ECG with the heartbeat removed. This shows that the signal between each heartbeat has a mean-reverting nature.

Figure 2.2.2¹ gives a comparison of the data with and without heartbeats. We can see that the process is indeed mean reverting, with a mean of roughly -400. Another important check of whether the process is suitable for the task is whether the data itself is Gaussian, which we assess by taking a histogram of the data. As we can see in figure 2.2.3, the filtered data does display a Gaussian distribution; therefore, for modelling purposes, the Ornstein-Uhlenbeck process should be sufficient.

Fig. 2.2.3: A histogram of the ECG with the heartbeat removed. This shows that the information is distributed normally.

¹This was created by using a tool to check for monotonicity in a set number of points in an ECG and thus remove a heartbeat. The method is displayed in appendix F.


    3 Kalman Filters

A Kalman filter (Kalman, 1960) is a discrete recursive numerical method used to filter noisy signals, estimate the true signal and infer other parameters associated with the system. It was originally developed for use in trajectory estimation for the Apollo space program (McGee and Schmidt, 1985). Since its conception it has come to be used in many everyday applications such as GPS and RADAR tracking, numerical weather prediction (NWP) (Evensen, 1992), turbulence modelling (Majda et al., 2010) and financial modelling (Krul, 2008).

The filter relies on the fact that the true state can be inferred from the state at the previous time step; the system is a Markov chain. This allows the filter to be used in modelling stochastic processes.

This section discusses the development of Kalman filters and their suitability for the problem of estimating the Ornstein-Uhlenbeck process parameters.

    3.1 Linear Kalman Filter

    In this subsection we will deal with the linear Kalman filter.

The linear Kalman filter tries to estimate the state x ∈ R^n using the following stochastic difference equation (Welch and Bishop, 2006):

x_k = A x_{k-1} + B u_{k-1} + w_{k-1}   (3.1.1)

where A is the state transition matrix of size n × n, B (size n × l) relates an optional control input u ∈ R^l to the state x, and w_k is white process noise of the form p(w) ~ N(0, Q), where Q is the process noise covariance. The filter has an observation equation of the form:

z_k = H x_k + v_k   (3.1.2)

where z_k ∈ R^m is the measurement of a signal, H is a measurement operator of size m × n and v_k is white measurement noise of the form p(v) ~ N(0, R), where R is the measurement noise covariance. It is important to note that the symbols used in the above definitions vary depending on the book, paper or notes used to reference the Kalman filter. In order to understand how the Kalman filter works the following terms also need to be defined:

x^f_k = prior estimate   (3.1.3)
x^a_k = posterior estimate   (3.1.4)

Along with these two terms, we can define the following covariance matrices:

P^f_k = E[(x_k - x^f_k)(x_k - x^f_k)^T] = prior covariance   (3.1.5)
P^a_k = E[(x_k - x^a_k)(x_k - x^a_k)^T] = posterior covariance   (3.1.6)

With all the required terms now defined, it is possible to display the Kalman filter equations. The equations can be broken down into two categories: prediction and correction steps.

The equations associated with the prediction step are as follows:

x^f_k = A x^a_{k-1} + B u_{k-1}   (3.1.7)
P^f_k = A P^a_{k-1} A^T + Q   (3.1.8)

The equations associated with the correction step are as follows:

K_k = P^f_k H^T (H P^f_k H^T + R)^{-1}   (3.1.9)
x^a_k = x^f_k + K_k (z_k - H x^f_k)   (3.1.10)
P^a_k = (I - K_k H) P^f_k   (3.1.11)


The yet-to-be-defined term, K_k, is the Kalman gain matrix, which relates the prior and posterior estimates².

    3.1.1 Algorithm

This section deals with the pseudocode for the linear Kalman filter.

Algorithm 3.1.1 Linear Kalman Filter Algorithm

define A, B, u, Q² and R²
for k = 1 to N do
    x^f_k = A x^a_{k-1} + B u_{k-1}
    P^f_k = A P^a_{k-1} A^T + Q
    K_k = P^f_k H^T (H P^f_k H^T + R)^{-1}
    x^a_k = x^f_k + K_k (z_k - H x^f_k)
    P^a_k = (I - K_k H) P^f_k
end for

    3.1.2 Application

In order to understand how the Kalman filter operates it is important to apply it to the underlying problem at hand: estimating the Ornstein-Uhlenbeck process. However, the process is nonlinear when we try to estimate all parameters (x, θ, μ and σ); thus, for the purpose of understanding the filter, it is assumed that the only unknown parameter is x and all others are known, simplifying the problem into a linear one.

We generate an Ornstein-Uhlenbeck process using the parameters defined in table 2.2.1 and, using the Kalman filter parameters defined in table 3.1.1, we call the filter to estimate the x state of the process.

x0   2
P0   1
A    e^(-θΔt)
Bu   μ(1 - e^(-θΔt))
Q²   (σ² / 2θ)(1 - e^(-2θΔt))
R²   0.01

Table 3.1.1: Parameters used to estimate the position of the OU process when passed through a linear Kalman Filter. It is assumed that θ, μ and σ are known.

Figure 3.1.1 shows that the Kalman Filter works well in estimating the true state given the noise added to the signal generated for this application. While it is not perfect, it does give us a good idea of how the filter operates by forecasting and correcting the signal: it will not completely believe the incoming measurement and instead modifies it closer to the true state. An important note is that the filter can be further fine-tuned by reducing the value of R² (the model measurement noise covariance). This section has provided a grounding in the Kalman filter; however, as this filter is linear it is of little

²For a more comprehensive description of the Kalman filter and a derivation of its equations please refer to (Simon, 2006).


use to us, as we are looking to estimate all parameters in the Ornstein-Uhlenbeck process; thus we need to consider non-linear options.

Fig. 3.1.1: Example of the Kalman Filter estimating the Ornstein-Uhlenbeck process. This demonstrates the ability of the filter to estimate the true state from the noisy observations.
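This application can be sketched end to end (illustrative Python rather than the project's MATLAB; variable names are my own). A true OU path is generated with the exact update, noisy measurements are taken with variance R = 0.01, and the scalar filter equations 3.1.7 to 3.1.11 (H = 1) are run with the parameters of tables 2.2.1 and 3.1.1:

```python
import numpy as np

rng = np.random.default_rng(0)
theta, mu, sigma, dt = 100.0, 0.0, 1.0, 0.01
A = np.exp(-theta * dt)                                            # state transition
Bu = mu * (1.0 - A)                                                # control term B*u
Q = sigma**2 * (1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta)   # process noise
R = 0.01                                                           # measurement noise

# generate a true OU path and noisy measurements z_k
n = 500
true = np.empty(n)
xprev = 2.0
for k in range(n):
    xprev = xprev * A + Bu + np.sqrt(Q) * rng.normal()
    true[k] = xprev
z = true + np.sqrt(R) * rng.normal(size=n)

# linear Kalman filter, equations 3.1.7-3.1.11 with H = 1
x, P = 2.0, 1.0                                                    # x0, P0 (table 3.1.1)
est = np.empty(n)
for k in range(n):
    x = A * x + Bu                                                 # prediction (3.1.7)
    P = A * P * A + Q                                              # (3.1.8)
    K = P / (P + R)                                                # gain (3.1.9)
    x = x + K * (z[k] - x)                                         # correction (3.1.10)
    P = (1.0 - K) * P                                              # (3.1.11)
    est[k] = x
```

The filtered estimate has a lower root-mean-square error against the true path than the raw measurements, mirroring the behaviour reported for figure 3.1.1.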

    3.2 Extended Kalman Filter

    In this subsection we will go on to discuss the extended Kalman filter used for non-linear problems.

The linear Kalman filter has an inherent problem in that it can only be applied to a linear system; in the example of the Ornstein-Uhlenbeck process this means that the only element that can be estimated is the position, x. The aim is to estimate all parameters associated with the OU process (x, θ, μ and σ); therefore we need to consider the case where the filter takes non-linearity into account. The first step in doing so is the extended Kalman filter (EKF).

The EKF works by linearizing the set of equations that we will be using to estimate the process. If we use the generalised equations (Welch and Bishop, 2006):

x_k = f(x_{k-1}, u_{k-1}, w_{k-1})   (3.2.1)

z_k = h(x_k, v_k)   (3.2.2)

we can then linearize the equations using a Taylor expansion and apply them to the filter equations, provided we have the following Jacobian matrices:

A_ij = ∂f_i/∂x_j (x_{k-1}, u_{k-1}, 0)   (3.2.3)
W_ij = ∂f_i/∂w_j (x_{k-1}, u_{k-1}, 0)   (3.2.4)
H_ij = ∂h_i/∂x_j (x_k, 0)   (3.2.5)


V_ij = ∂h_i/∂v_j (x_k, 0)   (3.2.6)

where x_k is a posterior estimate as defined in section 3.1. With the above definitions we can define the extended Kalman filter equations. Again they can be split into estimation and correction steps.

The estimation equations are:

x^f_k = f(x^a_{k-1}, u_{k-1}, 0)   (3.2.7)
P^f_k = A_k P^a_{k-1} A_k^T + W_k Q_{k-1} W_k^T   (3.2.8)

The correction equations are:

K_k = P^f_k H_k^T (H_k P^f_k H_k^T + V_k R_k V_k^T)^{-1}   (3.2.9)
x^a_k = x^f_k + K_k (z_k - h(x^f_k, 0))   (3.2.10)
P^a_k = (I - K_k H_k) P^f_k   (3.2.11)

    3.2.1 Algorithm

The algorithm for the extended Kalman filter is very similar to the linear variant seen in algorithm 3.1.1 in section 3.1.1.

Algorithm 3.2.1 Extended Kalman Filter Algorithm

define f(x, u), A_0, W_0, H_0, V_0, Q² and R²
for k = 1 to N do
    x^f_k = f(x^a_{k-1}, u_{k-1}, 0)
    P^f_k = A_k P^a_{k-1} A_k^T + W_k Q W_k^T
    K_k = P^f_k H_k^T (H_k P^f_k H_k^T + V_k R V_k^T)^{-1}
    x^a_k = x^f_k + K_k (z_k - h(x^f_k, 0))
    P^a_k = (I - K_k H_k) P^f_k
    redefine A_k, W_k, V_k and H_k {H and V will in most cases remain constant}
end for

    3.2.2 Application

In order to apply the EKF to the OU process we have to construct the Jacobian matrices for the problem at hand. We go back to equation 2.2.1 and apply persistence equations, with added artificial noise, for the non-observed states. The reason for the noise is to prevent these parameters from completely settling to one value. For the problem at hand, the patient undergoing surgery may be ablated, thus changing the shape of their ECG; therefore we need to ensure that the model covariance for the parameters does not settle to 0 and thus stop believing the data being taken in from the electrodes attached to the patient:

θ^f_k = θ^a_{k-1} + C dW   (3.2.12)
μ^f_k = μ^a_{k-1} + C dW   (3.2.13)
σ^f_k = σ^a_{k-1} + C dW   (3.2.14)


We obtain the following Jacobian matrices:

A = ( e^(-θΔt)   Δt e^(-θΔt)(μ - X_t)   1 - e^(-θΔt)   0
      0          1                       0              0
      0          0                       1              0
      0          0                       0              1 )   (3.2.15)

W = ( 0 0 0 0
      0 C 0 0
      0 0 C 0
      0 0 0 C )   (3.2.16)

H = ( 1 0 0 0 )   (3.2.17)

V = ( 0 )   (3.2.18)

Applying these matrices to the filter results in very poor performance and as such the figure has not been displayed. The filter diverges, with the x_t estimate reaching values of O(10^172) after 17 time steps. The extended Kalman filter suffers from linearization error, which could explain the divergence of the parameters. Other approaches to the extended Kalman filter could be pursued, such as hybrid filters, which consider a continuous-time system with discrete-time measurements, or higher-order approaches to the linearization. However, despite these options, the EKF can be very difficult to tune and can give unreliable estimates depending on the severity of the nonlinearity of the system (Simon, 2006). The linearization of the covariance error associated with the EKF can result in unbounded linear instabilities for the error evolution (Evensen, 1992). Therefore other filter types need to be examined as an alternative to higher-order linearizations, which brings us to the ensemble family of filters discussed in section 4.


    4 Ensemble Filters

Ensemble filters are alternatives to the traditional filters discussed in section 3; the two most well known are the ensemble Kalman filter and the particle filter. In ensemble filters the error covariance matrix is represented by a large ensemble of model realizations: the uncertainty in the system is represented by a set of model realizations rather than an explicit expression for the error covariance (Evensen, 2009). The model states are then integrated forward in time to predict the error statistics. Research has also shown that the use of ensemble filters for non-linear models costs less computationally than an extended Kalman filter (Evensen, 2006). Subsequently, ensemble filters have found widespread use when handling a large state space, such as in NWP.

    4.1 Particle Filter

In this subsection we will discuss particle filtering techniques used for estimating non-linear systems where the probability density function is non-Gaussian.

The particle filter is a sequential Monte Carlo algorithm that, as mentioned in section 4, uses an ensemble of N members, or particles, to estimate the characteristics of a system. It is a computational method of implementing a Bayesian estimator. In order to understand how it works we must first look at Bayes' theorem, since the particle filter computes the statistics of the system from which information can be extracted. We begin with our system and measurement equations (Simon, 2006):

x_{k+1} = f(x_k, w_k)   (4.1.1)

z_k = h(x_k, v_k)   (4.1.2)

p(x_k | Z_k) = p(z_k | x_k) p(x_k | Z_{k-1}) / ∫ p(z_k | x_k) p(x_k | Z_{k-1}) dx_k   (4.1.3)

where Z_k denotes the measurements z_1, z_2, …, z_k. Equation 4.1.3 does pose some problems because the denominator can prove to be intractable; hence in many cases it is necessary to use a delta function to integrate the function and estimate the probability density function of the system. By being able to evaluate equation 4.1.3 we will be able to integrate our model in time using the Euler-Maruyama method or the exact solution to the OU process.

Now that we understand Bayes' theorem we can begin to look at the particle filter and how to apply it. Unlike the family of Kalman filters, the particle filter does not assume that the distribution is Gaussian, which means evaluating the pdf is much more difficult; we therefore need to represent it using a series of weighted particles, where ω^(i)_t represents the normalized weight of the ith particle at time t:

P(x_t | Z_t) ≈ Σ_i ω^(i)_t δ(x_t - x^(i)_t)   (4.1.4)

where Z_t = (z_t, z_{t-1}, …, z_0). To initialize the particle filter we distribute a set of N particles based on a known pdf. We shall use the notation x^a_{k,i} and x^f_{k,i}, where k is the time step, i is the particle number, a denotes the analysis step and f denotes the forecast state. We begin by evaluating each particle and generating a prior state:

x^f_{k,i} = f(x^a_{k-1,i}, w_{k-1})   (4.1.5)

We then compute the relative likelihood of each particle by evaluating the pdf p(z_k | x^f_{k,i}), which we will denote q_i. We then normalize each likelihood to obtain the weight of each particle:

ω^(i)_k = q_i / Σ_{j=1}^N q_j   (4.1.6)


Once we have the normalized weights we are able to resample the particles to generate the posterior state x^a_{k,i} according to the relative likelihoods, and thus we have our pdf p(x_k | z_k). The particle filter does suffer from some problems, namely sample impoverishment, in which case all the particles collapse to the same value (Simon, 2006). There are methods of reducing this impoverishment, such as adding random noise or modifying the resampling step using a Monte Carlo Metropolis-Hastings algorithm. While this illustrates the use of particle filters, we can simplify the problem because we have shown that we are dealing with a Gaussian distribution, allowing us to move to less computationally expensive methods.
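One predict-weight-resample cycle described above can be sketched as follows (illustrative Python; the helper names are my own). The particles are propagated through the exact OU update of equation 2.2.4, weighted by an assumed Gaussian measurement likelihood and resampled by weight:

```python
import numpy as np

def particle_filter_step(particles, zk, propagate, likelihood, rng):
    """One cycle: forecast (eq 4.1.5), weight (eq 4.1.6), resample."""
    particles = propagate(particles, rng)          # prior x^f_{k,i}
    q = likelihood(zk, particles)                  # relative likelihoods q_i
    w = q / q.sum()                                # normalized weights
    idx = rng.choice(particles.size, size=particles.size, p=w)
    return particles[idx]                          # posterior ensemble x^a_{k,i}

rng = np.random.default_rng(0)
theta, mu, sigma, dt, R = 100.0, 0.0, 1.0, 0.01, 0.01
decay = np.exp(-theta * dt)
sd = sigma * np.sqrt((1.0 - np.exp(-2.0 * theta * dt)) / (2.0 * theta))
propagate = lambda p, rng: p * decay + mu * (1.0 - decay) + sd * rng.normal(size=p.size)
likelihood = lambda z, p: np.exp(-0.5 * (z - p) ** 2 / R)  # Gaussian measurement pdf
particles = rng.normal(0.0, 1.0, size=2000)
posterior = particle_filter_step(particles, 0.1, propagate, likelihood, rng)
```

After one step the resampled ensemble concentrates around the measurement, and its spread shrinks towards the posterior uncertainty; repeating the cycle over a measurement sequence gives the full filter.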

    4.2 Ensemble Kalman Filter

In this subsection we will examine the ensemble Kalman filter, used for estimating a non-linear problem under the assumption that the model displays a normal distribution. This subsection will also go on to discuss modifications to the filter required to prevent members collapsing to a single value.

The ensemble Kalman filter (EnKF) works in a similar manner to the particle filter; however, it is computationally less expensive because of the assumption that the distribution of the system is Gaussian and that every member has an equal weighting. Evensen (2006) shows that the EnKF is a special version of the particle filter where the update step is approximated by a linear update using just the mean and covariance of the pdf. In order to avoid confusion, the notation for the EnKF will be slightly different to that proposed in section 3. From now on, x^f_{k,i} is the ith forecast ensemble member at time k, x^a_{k,i} is the corrected ith member at time k and, in our particular application, x = (x, θ, μ, σ). We can then revert back to our system equations 3.2.1 and 3.2.2 and use them for our analysis step.

x^f_{k,i} = f(x^a_{k-1,i}, u_{k-1}, 0)   (4.2.1)

z_{k,i} = h(x^f_{k,i}, 0)   (4.2.2)

In section 3.1 we defined the covariance matrices for the prior and posterior distributions in equations 3.1.5 and 3.1.6; here, however, we need to pursue a slightly different method to establish the covariance matrices. We begin by defining the matrix X^f_k ∈ R^(n×N) as the matrix of ensemble members:

X^f_k = (x^f_{k,1}, x^f_{k,2}, …, x^f_{k,N})   (4.2.3)

We can then define a matrix X̄_k ∈ R^(n×N) as the matrix of the ensemble mean:

X̄_k = X^f_k 1_N   (4.2.4)

where 1_N is a matrix where all entries are equal to 1/N. Once we have these definitions we can assemble a matrix of fluctuations:

X′^f_k = X^f_k - X̄_k   (4.2.5)

And finally we can assemble our error covariance matrix:

P^f_k = (1 / (N - 1)) X′^f_k (X′^f_k)^T   (4.2.6)

Now that we have our error covariance matrix we can proceed to perform our Kalman filter update equations:

K_k = P^f_k H^T (H P^f_k H^T + R)^{-1}   (4.2.7)

x^a_{k,i} = x^f_{k,i} + K_k (z_k - H x^f_{k,i})   (4.2.8)

P^a_k = (I - K_k H) P^f_k   (4.2.9)

The most important changes are the way in which the prior error covariance matrix is assembled and the way in which the Kalman update equation is applied: the update equation is applied to every ensemble member.


    4.2.1 Algorithm

    Algorithm 4.2.1 demonstrates how to implement the EnKF.

    Algorithm 4.2.1 Ensemble Kalman Filter Algorithm

    define N {number of members}
    distribute the N members normally with a chosen variance
    for k = 1 to T do
        for i = 1 to N do
            x^f_{k,i} = f(x^a_{k−1,i}, u_{k−1})
        end for
        X'^f_k = X^f_k (I − 1_N)
        P^f_k = (1/(N−1)) X'^f_k (X'^f_k)^T
        K_k = P^f_k H^T (H P^f_k H^T + R)^{−1}
        P^a_k = (I − K_k H) P^f_k
        for i = 1 to N do
            x^a_{k,i} = x^f_{k,i} + K_k (z_k − H x^f_{k,i})
        end for
    end for
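The analysis step of Algorithm 4.2.1 can be sketched in a few lines of numpy. This is a minimal illustration, not code from the report: the function name `enkf_step`, the toy two-state system and its values are assumptions, and (as in equation 4.2.8) the same observation z is applied to every member.

```python
import numpy as np

rng = np.random.default_rng(0)

def enkf_step(Xf, z, H, R):
    """One EnKF analysis step. Xf is n x N, columns are forecast members."""
    n, N = Xf.shape
    Xbar = Xf.mean(axis=1, keepdims=True)              # ensemble mean
    Xp = Xf - Xbar                                     # fluctuations, eq. 4.2.5
    Pf = Xp @ Xp.T / (N - 1)                           # forecast covariance, eq. 4.2.6
    K = Pf @ H.T @ np.linalg.inv(H @ Pf @ H.T + R)     # Kalman gain, eq. 4.2.7
    # apply the linear update to every ensemble member, eq. 4.2.8
    Xa = Xf + K @ (z.reshape(-1, 1) - H @ Xf)
    return Xa

# toy example: 2-state system, only the first state observed
H = np.array([[1.0, 0.0]])
R = np.array([[0.01]])
Xf = rng.normal([[1.0], [0.5]], 0.1, size=(2, 100))    # 100 members around (1, 0.5)
Xa = enkf_step(Xf, np.array([1.2]), H, R)
```

The analysis ensemble mean is pulled from the forecast mean towards the observation by the gain, which is exactly the behaviour exploited when the parameters are appended to the state vector.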

    4.2.2 Application

    We can now begin to examine how well the EnKF performs and decide how much further the filter can be taken towards our goal of estimating the parameters of an OU process.

    x_t, μ and σ Estimation    To start we will first look at estimating the diffusion parameters (μ and σ) along with x_t in the OU process while assuming that we know θ. The equations that we will be using for our analysis will therefore be similar to those seen in section 3.2.2, including the persistence equations for μ and σ and the exact solution to the OU process for x_t. In this section we will present the results from running the EnKF. We will create a single OU process as a benchmark with the parameters in table 4.2.1.

    μ 10
    σ 200
    θ 500
    Number of points 1,000,000
    Total time 100 s

    Table 4.2.1: Parameters for the benchmark OU process which will be used as a control for the development of the filter.

    With these parameters set and an OU process saved, we will begin by examining how well the filter performs with 100 ensemble members and a measurement noise variance of 0.01. The initial variance of the ensemble is 1 × 10⁻⁹, which has been chosen through empirical testing. In this test we have removed the Wiener process from the μ and σ parameters. We can see in figure 4.2.1 that the x_t estimator works very well. Plots of the first and last 100 points have been included to demonstrate the ability of the filter to estimate the measured state. The numerical results of the simulation can be seen in table 4.2.2. The total time taken has been displayed to illustrate the trade-off between the number of ensemble members and the computational cost of running the simulation.
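The benchmark process can be generated with the exact one-step transition of the OU process dx = θ(μ − x)dt + σ dW, matching the exact solution the text refers to. A sketch, using the parameters of table 4.2.1 (the function name `simulate_ou` is illustrative, and a smaller number of points is used here to keep the example fast):

```python
import numpy as np

def simulate_ou(mu, sigma, theta, dt, n, x0, seed=0):
    """Exact discretisation of the OU process dx = theta*(mu - x)dt + sigma dW."""
    rng = np.random.default_rng(seed)
    a = np.exp(-theta * dt)                          # decay factor over one step
    s = sigma * np.sqrt((1 - a**2) / (2 * theta))    # exact one-step noise scale
    x = np.empty(n)
    x[0] = x0
    xi = rng.standard_normal(n - 1)
    for k in range(n - 1):
        x[k + 1] = mu + (x[k] - mu) * a + s * xi[k]
    return x

# benchmark parameters from table 4.2.1, with n reduced for speed
x = simulate_ou(mu=10.0, sigma=200.0, theta=500.0, dt=1e-4, n=10_000, x0=10.0)
```

Unlike Euler–Maruyama, this discretisation is exact for any step size, so the sample mean and standard deviation should match the stationary values μ and σ/√(2θ).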

    μ 27.513
    σ 197.5599
    Time taken 256.512845 s

    Table 4.2.2: The result of the EnKF estimating the benchmark OU process with 100 ensemble members.

    Fig. 4.2.1: Graphical result of the EnKF with 100 members estimating the benchmark process. The top two graphs show the first and last 100 points of the estimation process. The third graph shows the estimation for μ and the final graph shows the estimation of σ. This demonstrates that the method works well for estimating the position x_t; however, it fails to correctly estimate the non-observed parameters using the persistence model.

    While the μ and σ plots are not clear in figure 4.2.1, if we zoom into the first 1000 points we can see the change in both parameters before they settle to the set values, as shown in figure 4.2.2.

    We can see that the filter can estimate the parameters that are not measured fairly well. Further tests are required to see whether this performance is repeated every time; therefore the filter is run with the same parameters twice more, and the results are presented in table 4.2.3. It is important to note that between each run of the filter the memory of the computer being used is cleared, and the only information that is constant throughout is the OU process. The results show that the second run did not prove to be promising, and the third run gave a better value of μ but a worse value of σ compared to the first simulation. Therefore further testing is required to understand whether the number of ensemble members is sufficient or whether further research into ensemble filters needs to be pursued.

    Fig. 4.2.2: Examining the first 1000 points of the non-observed parameter estimates from the EnKF with 100 ensemble members. The persistence model has an effect within this region; however, the members collapse to one value, which leads to a constant estimate.

    μ2 -20.4883
    σ2 20.9569
    Time taken2 315.764881 s
    μ3 16.5418
    σ3 158.1899
    Time taken3 249.621532 s

    Table 4.2.3: Further results for the EnKF with 100 ensemble members. This shows that there is no level of consistency in estimating μ and σ.

    We can see from figure 4.2.1 that the parameters settle down very quickly, and for further testing we will subsample the OU process to add an extra element of uncertainty, in order to understand the robustness of the filter. Henceforth we shall use every 10th sample from the stochastic process. In doing so we will need to modify our value of Δt, which has been taken into account in the subsequent tests. Another important measure is to understand how the filter operates with 100 ensemble members with subsampling. Having run the test three times we obtain the averages of the parameters seen in table 4.2.4.

    μ_average 30.4354
    σ_average 75.2114
    Total time_average 29.30938 s

    Table 4.2.4: An example of the filter operating when the benchmark process is subsampled.

    250 ensemble members    In this section we will look at how the filter works with subsampling and the use of 250 ensemble members.


    μ1 5.4461
    σ1 108.6679
    Time taken1 108.6679 s
    μ2 11.2516
    σ2 147.8386
    Time taken2 107.158336 s
    μ3 11.1496
    σ3 162.4759
    Time taken3 105.324963 s

    Table 4.2.5: Results of the EnKF with 250 ensemble members. This shows an increased level of accuracy from using more members; however, this comes at an added computational cost.

    We can see in table 4.2.5 that by increasing the number of ensemble members we see an increase in the ability of the filter to estimate the parameters, although there is a trade-off in computational time.

    500 ensemble members    We will now consider the case where there are 500 ensemble members. Table 4.2.6 shows that the filter works better when the number of members is increased; however, the cost of doubling the members from the 250-member case is that the time taken is more than quadrupled. This demonstrates the trade-off between time taken and the accuracy required.

    μ1 7.4670
    σ1 166.6105
    Time taken1 465.012905 s
    μ2 11.4385
    σ2 161.7368
    Time taken2 476.574365 s
    μ3 14.0925
    σ3 171.1054
    Time taken3 614.160173 s

    Table 4.2.6: Results from the EnKF with 500 ensemble members. This shows a greater level of accuracy than the cases with 250 and 100 ensemble members; however, the computational cost is almost 4 times that of the case with 250 members.

    Added noise to μ and σ    The problem with trying to achieve further accuracy is that our persistence model causes the parameters to settle to a value and the filter stops believing the data that it is receiving; the covariance for both μ and σ becomes equal to 0. In applying this to the problem of AF this is a problem, as we are hoping to look for any changes to the parameters in the incoming ECG data. If the model stops believing the incoming data, then it will not be able to detect any changes once the value has settled. So far we have run the persistence models without artificial Wiener processes; if we run the filter again with the parameters C_μ = 0.5 and C_σ = 1.0, we can examine whether doing so improves the accuracy of the method and prevents the members from settling to one value. The result is shown in figure 4.2.3.

    We can see that while the values no longer settle to a constant, the added uncertainty is not large enough for σ. If we increase C_σ to 10.0, the Wiener process is now sufficient to cause σ to fluctuate around the actual value, as we can see in figure 4.2.4.
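The noisy persistence forecast can be sketched as follows. This is an illustrative sketch, not the report's code: the function name and the √Δt scaling of the Wiener increments are assumptions, with C_μ = 0.5 and C_σ = 10.0 as in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 1e-4                   # time step of the benchmark process (assumed)
C_mu, C_sigma = 0.5, 10.0   # noise amplitudes used in the text

def persistence_with_noise(mu_members, sigma_members):
    """Persistence forecast for the parameter members, with an artificial
    Wiener increment added so the ensemble spread cannot collapse to zero."""
    N = mu_members.size
    mu_f = mu_members + C_mu * np.sqrt(dt) * rng.standard_normal(N)
    sigma_f = sigma_members + C_sigma * np.sqrt(dt) * rng.standard_normal(N)
    return mu_f, sigma_f

# a fully collapsed ensemble regains spread after one forecast step
mu_f, sigma_f = persistence_with_noise(np.full(100, 10.0), np.full(100, 200.0))
```

The point of the added noise is visible immediately: even if every member holds the same value, the forecast ensemble has non-zero variance, so the filter keeps weighting the incoming data.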


    Fig. 4.2.3: Parameter estimation using 500 ensemble members and added Wiener processes in the persistence models, where C_μ = 0.5 and C_σ = 1.0. This prevents the persistence model from settling to one value, although the estimation for σ is not sufficient.

    Fig. 4.2.4: Parameter estimation using 500 ensemble members and added Wiener processes in the persistence model with the parameters C_μ = 0.5 and C_σ = 10.0. This shows that the added noise has an impact on the estimation of σ, causing the signal to fluctuate around the true value.

    Fig. 4.2.5: Estimating all parameters using 500 ensemble members and added Wiener processes, where C_μ = 0.5 and C_σ = C_θ = 10.0. This shows that the method does allow the estimates of μ and σ to fluctuate around the true values; however, the estimation of θ fails.

    We are now a step closer to being able to approach the entire problem and include θ in our parameter space. Running the simulation with the same parameters used to create the filter for figure 4.2.4, and now including a Wiener process for the θ equation with C_θ = 10.0, we obtain figure 4.2.5. We can see that the method still works well for μ and σ, although the latter has more fluctuations around the mean. The θ plot performs very poorly; it fluctuates around a mean of roughly -200, which is very far from the 500 used to generate the process. We need to look at alternatives or modifications to the EnKF to ascertain whether the performance of the filter can be improved. It is stated in (Yang and DelSole, 2009) that while the method of augmenting our state space to include the parameters works well, we also need to consider how stable it is: problems can arise when the parameters are multiplicative, which can cause the model to become dynamically unstable. Based on this we will now consider the ensemble square root filter, which is discussed in section 4.3.

    4.3 Ensemble Square Root Filter

    This subsection deals with a modification to the EnKF called the ensemble square root filter, which applies the update to the perturbations from the mean as opposed to the members themselves. We will also go on to discuss temporal smoothing and artificial covariance inflation methods.

    The ensemble square root filter (EnSRF) is a modified form of the ensemble Kalman filter. We propose splitting the state space vector into two vectors: an m-dimensional vector specifying the forecast state (x^f) and a q-dimensional vector specifying the uncertain model parameters (b^f). This requires a change to some of the equations used in the EnKF. In the EnKF we performed the update equation on each individual ensemble member; in this case we apply the Kalman update to the mean of the ensemble and then propagate the fluctuations instead. We redefine our observation equation as

    z_k = H_x x^f_k + v_k    (4.3.1)

    where H_x is a measurement operator that maps the forecast state to the observation, and v_k is as defined previously: white noise with mean 0 and covariance R. We can then define our augmented state vector x̃^f as

    x̃^f_k = [x^f_k; b^f_k]    (4.3.2)

    and an augmented measurement operator as

    H = [H_x  0]    (4.3.3)

    Like the EnKF we require a matrix of ensemble fluctuations. We defined this in equation 4.2.5; we now require an equivalent for the model parameters:

    B'^f_k = B^f_k (I − 1_N)    (4.3.4)

    where B^f_k = (b^f_1, b^f_2, ..., b^f_N). With the matrices of fluctuations defined, we can assemble our error covariance matrix in a way similar to the EnKF, although it now requires three steps. We require the following three matrices (note the subscript k has been dropped for this notation):

    P_xx = (1/(N−1)) X'^f (X'^f)^T    (4.3.5)
    P_bx = (1/(N−1)) B'^f (X'^f)^T    (4.3.6)
    P_bb = (1/(N−1)) B'^f (B'^f)^T    (4.3.7)

    Hence

    P^f = [ P_xx  P_bx^T ; P_bx  P_bb ]    (4.3.9)

    We now require two Kalman gain matrices:

    K_x = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1}    (4.3.10)
    K_b = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1}    (4.3.11)

    Our update is now twofold. We have our regular Kalman update applied to the means of x^f_k and b^f_k:

    x̄^a_k = x̄^f_k + K_x (z_k − H_x x̄^f_k)    (4.3.12)
    b̄^a_k = b̄^f_k + K_b (z_k − H_x x̄^f_k)    (4.3.13)

    and finally, assuming that our observations are independent, we can follow (Whitaker and Hamill, 2002) and apply the following updates to the fluctuations from the mean:

    x'^a_j = x'^f_j − α K_x H_x x'^f_j    (4.3.14)
    b'^a_j = b'^f_j − α K_b H_x x'^f_j    (4.3.15)

    where

    α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1}    (4.3.16)

    and we can see that 0.5 ≤ α < 1.


    4.3.1 Algorithm

    Algorithm 4.3.1 Ensemble Square Root Filter Algorithm

    define N {number of members}
    distribute the N members normally with a chosen variance
    for k = 1 to T do
        for i = 1 to N do
            x^f_{k,i} = f(x^a_{k−1,i}, 0)
            b^f_{k,i} = g(b^a_{k−1,i}, 0)  {g is generally a persistence equation}
        end for
        X'^f_k = X^f_k (I − 1_N)
        B'^f_k = B^f_k (I − 1_N)
        P_xx = (1/(N−1)) X'^f_k (X'^f_k)^T
        P_bx = (1/(N−1)) B'^f_k (X'^f_k)^T
        P_bb = (1/(N−1)) B'^f_k (B'^f_k)^T
        P = [ P_xx  P_bx^T ; P_bx  P_bb ]
        K_x = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1}
        K_b = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1}
        x̄^a_k = x̄^f_k + K_x (z_k − H_x x̄^f_k)
        b̄^a_k = b̄^f_k + K_b (z_k − H_x x̄^f_k)
        α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1}
        for j = 1 to N do
            x'^a_{j,k} = x'^f_{j,k} − α K_x H_x x'^f_{j,k}
            b'^a_{j,k} = b'^f_{j,k} − α K_b H_x x'^f_{j,k}
            x^a_{j,k} = x̄^a_k + x'^a_{j,k}
            b^a_{j,k} = b̄^a_k + b'^a_{j,k}
        end for
    end for
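One analysis step of Algorithm 4.3.1 can be sketched in numpy. This is an illustrative sketch under the assumption of a scalar observation (so α of equation 4.3.16 is a scalar); the function name `ensrf_step` and the toy values are not from the report.

```python
import numpy as np

def ensrf_step(Xf, Bf, z, Hx, R):
    """One EnSRF analysis step: the Kalman update (eqs. 4.3.12-4.3.13) is
    applied to the ensemble means only, and the fluctuations are damped by
    the factor alpha of eq. 4.3.16 (eqs. 4.3.14-4.3.15)."""
    N = Xf.shape[1]
    xbar = Xf.mean(axis=1, keepdims=True)
    bbar = Bf.mean(axis=1, keepdims=True)
    Xp, Bp = Xf - xbar, Bf - bbar                  # fluctuation matrices
    Pxx = Xp @ Xp.T / (N - 1)
    Pbx = Bp @ Xp.T / (N - 1)
    S = Hx @ Pxx @ Hx.T + R                        # innovation covariance
    Kx = Pxx @ Hx.T @ np.linalg.inv(S)
    Kb = Pbx @ Hx.T @ np.linalg.inv(S)
    innov = z.reshape(-1, 1) - Hx @ xbar
    xbar_a = xbar + Kx @ innov                     # eq. 4.3.12
    bbar_a = bbar + Kb @ innov                     # eq. 4.3.13
    alpha = 1.0 / (1.0 + np.sqrt(R / S))           # eq. 4.3.16, scalar observation
    Xp_a = Xp - alpha * (Kx @ Hx @ Xp)             # eq. 4.3.14
    Bp_a = Bp - alpha * (Kb @ Hx @ Xp)             # eq. 4.3.15
    return xbar_a + Xp_a, bbar_a + Bp_a

rng = np.random.default_rng(3)
Xf = rng.normal(1.0, 0.1, size=(1, 50))            # observed state members
Bf = rng.normal(0.0, 0.1, size=(1, 50))            # parameter members
Xa, Ba = ensrf_step(Xf, Bf, np.array([1.3]), np.array([[1.0]]), np.array([[0.01]]))
```

Because 0.5 ≤ α < 1, the fluctuation update shrinks the ensemble spread by less than the full gain would, which is what distinguishes the EnSRF from applying equation 4.2.8 to every member.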

    4.3.2 Application

    As we have done previously, we will begin by applying the EnSRF to x_t, μ and σ while assuming that θ is known, to understand whether the additional steps in the EnSRF provide better performance than the EnKF. We will examine the cases of 250 and 500 ensemble members. In both examples we will include the additional Wiener processes using the same parameters as discussed previously: C_μ = 0.5 and C_σ = 10.0.


    250 ensemble members    Running the filter with 250 ensemble members results in figure 4.3.1. As we can see, the method does not settle and the values do indeed tend to the real values, although σ seems to be undershooting. Running the filter three times results in the average figures displayed in table 4.3.1; in each run we take a time average and then average over all the runs.

    Fig. 4.3.1: Using the ensemble square root filter with 250 ensemble members to estimate μ and σ. The persistence model still contains the added noise. This shows that this method improves the accuracy compared to the EnKF.

    μ_average 11.2213
    σ_average 178.6549
    Time taken_average 229.281044 s

    Table 4.3.1: Average results for 250 ensemble members in the EnSRF.

    Already we can see some improvement over the EnKF; the average value of σ is improved, although again this is at the cost of the time taken to run each simulation: the processing time has now doubled due to the increased number of calculations required.

    500 ensemble members    We will now examine the method when using 500 ensemble members, following the same procedure as described for 250 ensemble members. Figure 4.3.2 shows an improvement over 250 ensemble members, with all values fluctuating closer to the actual values. The average results are presented in table 4.3.2.

    Fig. 4.3.2: Estimating μ and σ using 500 ensemble members in the EnSRF with added noise to the persistence model.

    μ_average 10.065
    σ_average 207.6561
    Time taken_average 1134.1052 s

    Table 4.3.2: Estimating μ and σ using 500 ensemble members in the EnSRF with added noise to the persistence model.

    The results are very promising. As we can see, we have improved the estimates of both parameters, and as such it is now time to see how well this filter performs with θ added to this case.

    500 ensemble members estimating μ, σ and θ    We will now consider the case where we have the added Wiener process in the θ persistence equation: we will set C_θ = 10.0 and examine how well this performs. Figure 4.3.3 shows the performance of all 3 runs of the filter. The numerical results are presented in table 4.3.3.

    μ_average 11.0981
    σ_average 194.2767
    θ_average -53.2441
    Time taken_average 1260.2632 s

    Table 4.3.3: Average values from the EnSRF estimating all parameters in the OU process with 500 ensemble members, with added noise to the persistence model applied.

    These results show that the extra cost of estimating θ is minimal; however, the filter fails at estimating the parameter. While the diffusion parameters (μ and σ) are estimated fairly well, the estimation of the drift term is poor: two of the three runs result in a negative value, and none are close to the actual value. Therefore further research is still required.


    Fig. 4.3.3: EnSRF with 500 ensemble members and added noise. The top and middle graphs demonstrate the ability of the filter to estimate the diffusion terms in the OU process. The bottom graph shows that the filter still fails to effectively estimate θ.

    Temporal Smoothing    It is suggested in (Yang and DelSole, 2009) that the persistence model can be modified using temporal smoothing, which can mitigate model blow-up for a small number of ensemble members. The required modification is shown in equation 4.3.17:

    b^f_{k+1,j} = (1 − γ) b^a_{k,j} + γ b^f_{k,j}    (4.3.17)

    It is suggested that γ = 0.8 gives the most effective result. We will consider the application of this using the same parameters as described so far.
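The smoothing step of equation 4.3.17 is a one-line convex blend of the analysed and previous forecast members. A minimal sketch (the function name is illustrative; the reading of the equation, with γ weighting the previous forecast, is assumed from the surrounding text):

```python
import numpy as np

gamma = 0.8  # smoothing weight suggested in (Yang and DelSole, 2009)

def smoothed_forecast(b_analysis, b_forecast):
    """Temporally smoothed parameter forecast, eq. 4.3.17: the next forecast
    blends the analysed members with the previous forecast members, damping
    abrupt parameter jumps that can cause model blow-up."""
    return (1.0 - gamma) * b_analysis + gamma * b_forecast
```

With γ = 0.8 only 20% of each analysis increment is passed into the next forecast, which is why the parameter trajectories become smoother at the cost of a slower response to genuine changes.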

    As we can see in figure 4.3.4 and table 4.3.4, the temporal smoothing is indeed beneficial: we are able to obtain accuracy close to the case with 500 members, thus reducing the computational time taken. An interesting note, however, is the instability that occurs in the third run of the filter; after about 8 × 10⁴ points we see one of the parameter estimates starting to blow up, and this is also reflected in the other two plots. Regardless of this, the other two attempts seem to be successful. This may have occurred due to a spurious point being generated in the normal distribution. However, we still encounter the problem of estimating θ.

    Fig. 4.3.4: The application of temporal smoothing to the EnSRF with 250 ensemble members with added noise. The top (μ) and middle (σ) graphs demonstrate estimation comparable to the case of 500 ensemble members. The bottom graph (θ) still fails to be estimated correctly.

    μ_average 9.9926
    σ_average 197.6678
    θ_average 39.2513
    Time taken_average 241.9382 s

    Table 4.3.4: Average results from the EnSRF with 250 members, temporal smoothing and added noise. This shows that the overall accuracy of the filter is increased by using temporal smoothing compared to the case with 500 members and no smoothing applied. θ continues to be estimated poorly.

    Previously we could see a definite increase in accuracy from using 500 ensemble members compared to 250; however, this is no longer the case, as shown by table 4.3.5. With temporal smoothing applied there is very little difference between the two cases for μ and σ, and we can see that the number of members has little to no effect on estimating θ. Thus we can infer that applying temporal smoothing improves accuracy for a smaller ensemble, decreasing the computational cost. Henceforth we will consider the case of 250 ensemble members.

    μ_average 9.3593
    σ_average 197.4367
    θ_average -62.6541
    Time taken_average 1641.4369 s

    Table 4.3.5: Average results from the EnSRF with 500 members, temporal smoothing and added noise. This shows little-to-no improvement over the case with 250 ensemble members for the diffusion parameters. θ continues to be estimated poorly.

    Artificial Covariance Inflation    We have already discussed why it is important for the model parameters to have some form of artificial noise or inflation to prevent the filter from settling to one value, as we require the filter to be able to adapt to changes that may occur to the patient during the surgical procedure. We have already proposed one method of artificial inflation: introducing additional Wiener processes to the persistence models of the parameters we are trying to estimate. An alternative is proposed by (Anderson, 2007), which involves applying an inflation coefficient to the update equation of the model parameters. This takes the form

    b^inf_{j,k} = √(α_b) (b_{j,k} − b̄_k) + b̄_k    (4.3.18)

    It is suggested that α_b should be incorporated into the state space vector; however, for simplicity we will tune the value by hand to understand how this method works. We will adopt different values of α for μ and for σ in the case at hand. We begin by removing the Wiener processes from the persistence model and instead applying the inflation, which occurs prior to the persistence model. As we have so far failed to estimate θ, we will assume that it is known, once again simplifying the problem in order to understand whether this extra step is viable. We will begin with the parameters α_μ = 1.02 and α_σ = 1.01. The result of this is shown in figure 4.3.5.
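The inflation of equation 4.3.18 can be sketched directly. This is an illustrative sketch: the function name is assumed, and scaling the deviations by √α_b (so the ensemble variance is multiplied by α_b while the mean is untouched) follows Anderson's usual formulation.

```python
import numpy as np

def inflate(b_members, alpha_b):
    """Artificial covariance inflation in the style of (Anderson, 2007),
    eq. 4.3.18: each member's deviation from the ensemble mean is scaled
    by sqrt(alpha_b), multiplying the ensemble variance by alpha_b while
    leaving the ensemble mean unchanged."""
    b_bar = b_members.mean()
    return np.sqrt(alpha_b) * (b_members - b_bar) + b_bar

b = np.array([1.0, 2.0, 3.0])
b_inflated = inflate(b, 4.0)   # variance quadrupled, mean preserved
```

Values of α_b only slightly above 1 (such as the 1.02 and 1.01 used here) therefore grow the parameter covariance by a few percent per step, just enough to stop it collapsing to zero.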

    Fig. 4.3.5: Application of artificial covariance inflation, instead of added noise, to the EnSRF with 250 ensemble members to estimate μ (top) and σ (bottom).

    μ_average 9.6066
    σ_average 196.8571
    Time taken_average 257.1983 s

    Table 4.3.6: Average data for the EnSRF with 250 ensemble members, artificial inflation, temporal smoothing and no θ estimation. This shows that this method is comparable to the method of adding noise to the unobserved members.

    We can see that the artificial inflation increases the covariance of both μ and σ. An interesting note is that during the third run of the estimation there is a very large amount of uncertainty, although the filter does settle down; this may be due to the initial distribution of the members in the filter, but despite this we see that the severity of the fluctuations reduces. While σ does settle to a mean, the inflation does increase the amount of time it takes for μ to settle, which should be expected: if θ controls how quickly the signal is pulled back to the mean, then the mean does indeed need to be discovered first. The average data is presented in table 4.3.6. We can see that this method also works well as an alternative to the additional Wiener processes. The most important outstanding issue, however, is estimating θ. We will now consider alternative methods for estimating this parameter.


    4.4 Estimating θ

    In this subsection we will discuss the problems in estimating the drift parameter in an SDE and propose a Bayesian method for estimating it.

    We have seen that estimating θ seems to be more complicated than estimating μ and σ. The reason for this was initially unknown; however, it is suggested in (DelSole and Yang, 2009) that if the initial covariance between x_t and θ_t vanishes, then θ^a_t remains constant and equal to the initial guess of θ.³

    So far we have been using Kalman filter methods to estimate the parameters by augmenting our state space vector. The proposed method instead involves two different vectors: a vector θ = [θ] of parameters treated in a Bayesian fashion and an augmented state vector x = [x_t, b]^T = [x_t, μ, σ]^T. If we apply Bayes' theorem:

    p(x, θ|z) ∝ p(z|x) p(x|θ) p(θ)    (4.4.1)

    and define

    p(z|x) = (2π)^{−M/2} |R|^{−1/2} exp(−½ (z − Hx)^T R^{−1} (z − Hx))    (4.4.2)

    The prior distribution p(x|θ) is assumed to be Gaussian with mean x̄ and covariance P, and the final distribution in equation 4.4.1, p(θ), is also assumed to be Gaussian with mean θ̄ and covariance Σ_θ. Taking the log of equation 4.4.1 and multiplying by −2 we obtain

    −2L = (θ − θ̄)^T Σ_θ^{−1} (θ − θ̄) + ln|P| + (z − Hx)^T R^{−1} (z − Hx) + (x − x̄)^T P^{−1} (x − x̄)    (4.4.3)

    Differentiating equation 4.4.3 with respect to x recovers the Kalman filter update equation. If instead we differentiate with respect to θ, using the identities

    ∂P^{−1}/∂θ_j = −P^{−1} (∂P/∂θ_j) P^{−1},    ∂ln|P|/∂θ_j = tr[P^{−1} ∂P/∂θ_j]

    we can obtain an equivalent update equation for θ:⁴

    θ = θ̄ − ½ tr[P^{−1} ∂P/∂θ] + ½ (z − Hx̄)^T (R + HPH^T)^{−1} H (∂P/∂θ) H^T (R + HPH^T)^{−1} (z − Hx̄)    (4.4.4)

    The implication is that this is an iterative method; however, it is suggested that only one iteration is required per time step. This equation introduces an extra term that requires evaluation: ∂P/∂θ. To evaluate it we require two simultaneous ensembles, used to calculate P(θ) and P(θ − δθ). Once we have these two covariance matrices we can approximate the term with a first-order difference:

    ∂P/∂θ ≈ (P(θ) − P(θ − δθ)) / δθ    (4.4.5)

    In doing this it is implied that we will be able to obtain a better estimate of θ. The method assumes that θ is constant, which is where it differs from the Kalman filter.

    ³For the full proof please refer to appendix B.
    ⁴For a full derivation please refer to (DelSole and Yang, 2009).


    4.4.1 Algorithm

    This section describes the procedure to implement a filter that estimates θ along with all the other parameters.

    Algorithm 4.4.1 Hybrid filter using an EnKF for x_t, μ and σ and a Bayesian estimator for θ

    define N {number of members}
    distribute the N members normally with a chosen variance
    define δθ and Σ_θ
    for k = 1 to T do
        for i = 1 to N do
            x^f_{k,i} = f(x^a_{k−1,i}, b^a_{k−1,i}, θ_{k−1})
            x̃^f_{k,i} = f(x^a_{k−1,i}, b^a_{k−1,i}, (θ − δθ)_{k−1})  {used to calculate P(θ − δθ)}
            b^f_{k,i} = g(x^a_{k−1,i}, b^a_{k−1,i}, θ_{k−1})
        end for
        X'^f_k = X^f_k (I − 1_N);  X̃'^f_k = X̃^f_k (I − 1_N);  B'^f_k = B^f_k (I − 1_N)
        P_xx = (1/(N−1)) X'^f_k (X'^f_k)^T;  P̃_xx = (1/(N−1)) X̃'^f_k (X̃'^f_k)^T
        P_bx = (1/(N−1)) B'^f_k (X'^f_k)^T;  P̃_bx = (1/(N−1)) B'^f_k (X̃'^f_k)^T
        P_bb = (1/(N−1)) B'^f_k (B'^f_k)^T
        P(θ) = [ P_xx  P_bx^T ; P_bx  P_bb ];  P(θ − δθ) = [ P̃_xx  P̃_bx^T ; P̃_bx  P_bb ]
        ∂P/∂θ = (P(θ) − P(θ − δθ)) / δθ
        θ_k = θ_{k−1} − ½ tr[P^{−1} ∂P/∂θ] + ½ (z − Hx̄)^T (R + HPH^T)^{−1} H (∂P/∂θ) H^T (R + HPH^T)^{−1} (z − Hx̄)
        K_x = P_xx H_x^T (H_x P_xx H_x^T + R)^{−1}
        K_b = P_bx H_x^T (H_x P_xx H_x^T + R)^{−1}
        x̄^a_k = x̄^f_k + K_x (z_k − H_x x̄^f_k)
        b̄^a_k = b̄^f_k + K_b (z_k − H_x x̄^f_k)
        α = (1 + sqrt(R / (H_x P_xx H_x^T + R)))^{−1}
        for j = 1 to N do
            x'^a_{j,k} = x'^f_{j,k} − α K_x H_x x'^f_{j,k}
            b'^a_{j,k} = b'^f_{j,k} − α K_b H_x x'^f_{j,k}
            x^a_{j,k} = x̄^a_k + x'^a_{j,k}
            b^a_{j,k} = b̄^a_k + b'^a_{j,k}
        end for
    end for
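The Bayesian θ step at the heart of Algorithm 4.4.1 (equations 4.4.4 and 4.4.5) can be sketched as follows. This is an illustrative sketch, not the report's code: the function name and argument layout are assumptions, with the two covariances supplied by the two simultaneous ensembles of the algorithm.

```python
import numpy as np

def theta_update(theta, P, P_pert, dtheta, z, H, R, xbar):
    """One Bayesian update step for theta (eq. 4.4.4). P is the augmented
    ensemble covariance run at theta, and P_pert the covariance of the second
    ensemble run at theta - dtheta, so dP/dtheta is the first-order
    difference of eq. 4.4.5."""
    dP = (P - P_pert) / dtheta                 # eq. 4.4.5
    S = R + H @ P @ H.T                        # innovation covariance
    Sinv = np.linalg.inv(S)
    d = (z - H @ xbar).reshape(-1, 1)          # innovation
    trace_term = 0.5 * np.trace(np.linalg.solve(P, dP))
    grad_term = 0.5 * (d.T @ Sinv @ H @ dP @ H.T @ Sinv @ d).item()
    return theta - trace_term + grad_term      # eq. 4.4.4, one iteration

# toy scalar check: P=2, P_pert=1.8, dtheta=0.1 -> dP=2; S=2.5, d=1
theta_new = theta_update(5.0, np.array([[2.0]]), np.array([[1.8]]), 0.1,
                         np.array([1.0]), np.array([[1.0]]),
                         np.array([[0.5]]), np.array([0.0]))
```

Note that only one iteration of the update is taken per time step, in line with the suggestion in (DelSole and Yang, 2009), and that θ itself carries no ensemble; this is why the hybrid filter is cheaper than estimating θ with 250 or 500 members.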


    Fig. 4.4.1: Using the proposed Bayesian approach to estimating θ. This has been taken from a simplified case of only estimating x_t and θ. This demonstrates that the method does not seem to work.

    4.4.2 Application

    We will now consider whether we can estimate θ. For testing purposes we will simplify the problem at hand and assume that μ and σ are known quantities. We begin by choosing values for Σ_θ and δθ. We will aim for a high variance, with Σ_θ = 1 × 10² and δθ = 0.01, and we will begin from an initial guess of θ₀ = 0.0.

    From figure 4.4.1 we can see that this process does not work; however, this may be down to the parameters that we have defined for the estimation process. The figure shows that the method is very slow to adapt to the incoming information, which can be addressed by increasing the set covariance of the process. We will now consider the case where Σ_θ = 2.5 × 10² and δθ = 1 × 10⁻³. The result is more promising: an initial run provides a mean value of θ of 668.3535. While this is still not near the target of 500 used to generate the OU process, it is some progress towards the goal set out. The filter was run twice more, and the results are presented in figure 4.4.2. An important note is that we have not benchmarked the time taken for this task, as we are interested in the performance of the method as opposed to its efficiency.

    Fig. 4.4.2: Further simplified tests with Σ_θ and δθ changed. This demonstrates some level of consistency compared to the Kalman filtering approach to estimating the drift parameter.

    We can see that while the filter is not perfect, we at least have some level of consistency in the values obtained, whereas previous attempts had none; we will therefore continue down the path of estimating this parameter using the outlined method.


    We run the filter again to check that the θ estimation is not serendipitous, as we saw in the estimation in figure 4.2.1. The result is shown in figure 4.4.4.

    Fig. 4.4.4: Hybrid filter with 3 separate runs. 250 ensemble members have been used, as well as artificial inflation and temporal smoothing. This demonstrates an improvement over the traditional Kalman filter methods for estimating the drift parameter.

    μ_average 9.6139
    σ_average 184.1664
    θ_average 403.8067
    Time taken_average 215.5576 s

    Table 4.4.2: Average results for the hybrid filter with 250 members, inflation and smoothing applied.

    An interesting note on this filter is that on average it is computationally less expensive than running the EnSRF with all parameters being estimated. This is because we are essentially using only one member for θ, as opposed to the 250 or 500 we were using previously.

    Now that we have a functional method for estimating the parameters in the OU process, we will consider fine-tuning the filter and ensuring that it is indeed suitable for the task of estimating data from a surgery.

    4.4.3 Changes in Parameters

    We have so far only considered the case where the parameters of the process are constant. The ultimate goal of this filter is that it can be applied to data during the surgery and look for changes that may occur during the ablation process. These will most likely show up as a change in either σ or θ; we will assume μ remains constant unless the surgeon moves the electrode. We will now consider the case where we merge two OU processes and attempt to see whether the filter does indeed adapt to changes in the signal. We will generate two OU processes with the parameters defined in table 4.4.3.

    μ1 0        μ2 0
    σ1 100      σ2 250
    θ1 500      θ2 700
    Number of points per process 5 × 10⁵
    Time per process 50 s
    Δt 1 × 10⁻⁴

    Table 4.4.3: Parameters for separate OU processes to be stitched together to examine the ability of the filter to adapt to changes.
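Stitching two regimes together amounts to simulating the second segment starting from the last value of the first. A sketch with the parameters of table 4.4.3 (the function name is illustrative, and far fewer points are used here than the 5 × 10⁵ per segment in the table, to keep the example fast):

```python
import numpy as np

def ou_segment(mu, sigma, theta, dt, n, x0, rng):
    """Exact discretisation of one OU regime, dx = theta*(mu - x)dt + sigma dW."""
    a = np.exp(-theta * dt)
    s = sigma * np.sqrt((1.0 - a**2) / (2.0 * theta))
    x = np.empty(n)
    x[0] = x0
    for k in range(n - 1):
        x[k + 1] = mu + (x[k] - mu) * a + s * rng.standard_normal()
    return x

rng = np.random.default_rng(2)
dt = 1e-4
seg1 = ou_segment(0.0, 100.0, 500.0, dt, 5_000, 0.0, rng)
seg2 = ou_segment(0.0, 250.0, 700.0, dt, 5_000, seg1[-1], rng)  # continue from last value
stitched = np.concatenate([seg1, seg2])
```

Because seg2 starts at seg1's final value, the stitched signal is continuous at the join; only the statistics change, which is exactly the kind of regime change the filter is being asked to detect.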

    We can see from figure 4.4.5 that the filter is able to adapt to changes in the parameters. Although, as we stated earlier, the θ estimation is still not perfect, we can see some shift in the parameter. It is possible to reduce the fluctuations of the θ plot; however, the cost of doing so is an increase in the time it takes for the filter to adapt to any changes.

    Fig. 4.4.5: This demonstrates the change of parameters in the OU process. The filter is able to adapt to changes in σ (middle). The filter is also able to sense a change in θ (bottom); however, the method still demonstrates undershoot.

    4.4.4 Robustness

    In this section we will discuss how robust the filter is. We have already discussed the reasoning behind using 250 ensemble members over 500 on the grounds of computational cost; we now examine the initial variance of the ensemble. Throughout these tests we have been using a variance of 1 × 10⁻⁹, but we should consider how the filter performs with different variances. For our initial case we will consider the following initial variances: 1 × 10⁻¹, 1 × 10⁻⁴ and 1 × 10⁻⁹. We found that 1 × 10⁻⁴ provides the best convergence: 1 × 10⁻⁹ requires almost 50 seconds' worth of information before the filter is able to correct itself, whereas the other two cases require less than 10 seconds.


    5 Application to Heart Surgery Data

    In this section we will discuss the application of the filtering techniques proposed to the ECG data.

    Now that we have the tools required, we can begin analysing data from the surgical procedure. We have discussed how we can get a rough estimate of θ and fairly good estimates of μ and σ, and how the filter can adapt to changes in the parameters that may occur during the surgery. Our final test is to use data from a surgical procedure to test whether the filter works. Throughout this section we will benchmark the filter against a maximum likelihood estimator. We will first consider the case where we have long-standing persistent (LSP) atrial fibrillation. As we can see in figure 5.0.1, there is a major difference when we try to extract heartbeats from the data. In figure 5.0.1a we can see that μ begins to fail; this may be due to the presence of a heartbeat skewing the data. In figure 5.0.1b we see some interesting features: μ begins to change drastically towards the end of the data. While this may look as though it is about to diverge, this is not the case; if we superimpose the data from the ECG we see that μ does indeed track the data from the ECG. It seems as though the filter believes that the mean is moving. This can be seen in figure 5.0.2. We also present the comparison between the maximum likelihood estimator and the filtered data in table 5.0.1.

    (a) Unfiltered Data (b) Filtered Data

    Fig. 5.0.1: Unfiltered and filtered data from LSP through the hybrid filter. (a) demonstrates the filter performance without the heartbeat. This shows that the OU process parameters could be changing after every heartbeat. (b) demonstrates the filter once the data has been modified.


    Fig. 5.0.2: μ and ECG data superimposed from data without heartbeats. This shows that μ seems to be tracking the data, implying the parameters could be changing after each heartbeat.

          Maximum Likelihood    Filter
    μ     -96.0205              -144.0886
    σ     12.6478               12.8961
    θ     973.6496              800.0923

    Table 5.0.1: Comparison of the maximum likelihood estimator to the filter for the LSP data.

    We will examine another case where we use data from a patient suffering from persistent atrial fibrillation (PsAF). The data is measured on channel III. By removing the majority of the data where the heart beats and running it through the filter we are able to obtain the information in figure 5.0.3.

    Fig. 5.0.3: Persistent atrial fibrillation channel III data passed through the hybrid filter. The bottom graph (θ) also confirms that the parameters of the OU process are changing after each heartbeat.


          Maximum Likelihood    Filter
    μ     -13.9613              66.8015
    σ     27.4558               18.9184
    θ     2.6046×10^3           1.4051×10^3

    Table 5.0.2: Comparison of the maximum likelihood estimator to the filter for the PsAF data.

    We can see that there are some major discrepancies between the likelihood estimator and the filter. This should be a cause for concern; however, the reason for the discrepancies is that some information from the heartbeat remains in the modified data. We can see the impact this has on μ: whenever a jump occurs, it corresponds to a point where some artefact of the heartbeat has remained. The solution is to improve the method used to remove the heartbeat, though this requires much more manual adjustment of the data than is being used at the moment.


    6 Final Remarks

    6.1 Limitations

    In this section we will discuss the limitations of the methods proposed.

    We have discussed the different versions of the Kalman filter, with a lot of focus on the ensemble square root filter, which is itself a variant of the ensemble Kalman filter. We have tested each method with the same Ornstein-Uhlenbeck process, and throughout this project we have discovered that while we have a method that can estimate all the parameters, there are limitations to this method.

    ECG data As we concluded in section 5, there are limitations in the data from the ECG. We have discussed that we are looking at the signal between each heartbeat, and running this data through the filter has proven to be harder than originally envisioned. The heartbeat was filtered out by testing for monotonicity over a set number of points and then going back through the data and removing the data around the monotone region. This proved to work for a small amount of the data, though due to the variety of the data a better method for removing the heartbeat is required. Even a small artefact of the heartbeat can have a major impact on all the parameters.
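The monotonicity test described above can be sketched as follows. This is an illustrative Python reconstruction, not the MATLAB code used in the project, and the window length k and padding pad are hypothetical tuning parameters:

```python
import numpy as np

def mask_heartbeats(x, k=20, pad=50):
    """Flag samples lying in or near a run of k strictly monotone points.

    A QRS complex rises (or falls) steeply, so a window of k consecutive
    monotone samples is taken as evidence of a heartbeat; `pad` samples on
    either side are masked as well. Returns a boolean array that is True
    where the sample should be removed before filtering.
    """
    d = np.diff(x)
    mask = np.zeros(len(x), dtype=bool)
    run = 0
    for i in range(1, len(d)):
        # extend the run while the sign of the increment is unchanged
        if d[i] * d[i - 1] > 0:
            run += 1
        else:
            run = 0
        if run >= k - 1:                       # found k monotone points
            lo = max(0, i - k - pad)
            hi = min(len(x), i + pad + 2)
            mask[lo:hi] = True
    return mask

# Usage: keep only the inter-beat signal
# x_clean = x[~mask_heartbeats(x)]
```

As the section notes, even a small residual artefact skews the parameter estimates, so a more sophisticated detector would be needed in practice.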

    Model Error Covariance Our first limitation is that we need to make an assumption about the model error covariance. Throughout this project we have assumed that R = 0.01^2; however, this is a parameter that will be almost impossible to determine accurately, and therefore we have to make an assumption about it.

    Fine Tuning A major issue with the filter is the level of fine tuning required by the user. We have tested artificial covariance inflation to prevent the data from settling to a known value. As we demonstrated, this allows the filter to adapt to any incoming changes that may occur during a surgery, such as when a surgeon has ablated parts of the patient's heart. We have looked at using λμ = λθ = 1.01; however, this value required a lot of fine tuning to obtain. In many runs we discovered problems where this method caused rank deficiency in the x ensemble; noticeably this occurred during the θ estimation. We had initially been forcing the method to be log-positive; however, some spurious points could cause the method to fail, so this constraint was removed, which reduced the frequency of the rank problems. This was further improved by using a pseudo-inverse when required.

    Once the Bayesian approach for θ was implemented we began to suffer from further problems. We know that θ is reliant on P; however, when coupled with inflation we found that the model error covariance could grow to very large numbers, resulting in NaN (not a number) values and causing the method to fail, therefore requiring the values of λ which had been used previously to be retuned.
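The pseudo-inverse fallback mentioned above can be illustrated as follows; this is a hedged Python/NumPy sketch (the project's implementation is in MATLAB, where pinv plays the same role):

```python
import numpy as np

def safe_solve(A, b, rcond=1e-10):
    """Solve A x = b, falling back to the Moore-Penrose pseudo-inverse
    when A is rank deficient, as can happen to the ensemble covariance
    under aggressive inflation."""
    if np.linalg.matrix_rank(A) < A.shape[0]:
        return np.linalg.pinv(A, rcond=rcond) @ b
    return np.linalg.solve(A, b)
```

The pseudo-inverse returns the minimum-norm least-squares solution, so the update degrades gracefully instead of raising a singular-matrix error mid-run.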

    Bayesian Approach to Estimating θ As we have demonstrated, we have had issues with estimating the drift term in the OU process. While we have managed to gain some insight as to why we cannot estimate it using conventional Kalman filtering techniques, the proposed method is not completely perfect. The method also requires a large amount of tuning to obtain the correct parameter values, and due to the problems encountered with the ECG data, further testing may be required to determine whether the values used are correct. The method also does not conform to the idea that we have been using in the filter: we assume that as more data comes into the filter we will be able to reduce the covariance of the parameter being estimated, whereas this method has one prescribed covariance which does not change. We also see that this method is still not perfect; the θ value is normally about 25% below the required value, although as we discussed this is a definite improvement over the EnKF and EnSRF.


    Convergence Time Another limitation of the method proposed is the convergence time of some parameters. We have noticed that μ is very quick to converge, whereas both σ and θ can take a long time to converge. We can understand why θ takes longer than the mean: θ controls how fast the signal reverts to the mean, therefore it makes sense that the mean is found before this parameter. However, in many runs it was found that some parameters can take between 10 and 20 seconds to converge. This can have a major impact on use during surgery depending on how the tool will be used. If it is to be used as a live tool then this may have an impact on the duration of the surgery; as a tool used after the operation this will have a much lesser impact. We have used a maximum likelihood estimator to obtain values of μ, σ and θ and then generate a corresponding OU process. We found that in many cases it takes at least 20 seconds' worth of information for the filter to fully converge. The problem is that each sample of the ECG is only 42 seconds, so if any changes do occur during the surgical procedure the filter may not be able to adapt quickly enough.

    Ornstein-Uhlenbeck Process as a Model We have also briefly discussed the validity of the OU process as a model for the heart signal. We have seen that the heartbeat can cause a huge change in the value of μ, implying that a heartbeat itself can cause a change in the OU parameters.

    6.2 Recommendations for Future Work

    In this subsection we will discuss recommendations for further work on this project.

    Further Filter Research Further research into the filter could be conducted to obtain a better method for estimating the drift term. The problem we face is that all the non-observed parameters are multiplicative, which increases the complexity of the problem. It is possible that we could increase the number of parameters in our state space vector in the following form:

    dX_t = θ(μ − X_t) dt + ξ_t                                           (6.2.1)
    dξ_t = σ dW_t                                                        (6.2.2)

    We would also include the parameters μ, σ and θ in our state space vector. In doing this we would turn the noise term from being multiplicative to additive, which in turn reduces the error. If this does not prove to be successful, further research into the Bayesian approach to estimating θ should be performed. It would also be useful to understand how to approach this problem with a particle filter, although this would increase the computational cost and also the complexity of the algorithm used. We could simplify the problem by running two separate ensembles with two different methods; we could use the EnSRF for μ and σ and then introduce a particle filtering method for θ. While the computational cost of running this filter might be high, the added cost could be justified if we can improve the accuracy of the method. We could implement various methods as described in (Simon, 2006) to reduce the sample impoverishment that occurs with the particle filter.
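As a rough illustration, the augmented system could be stepped forward with the Euler-Maruyama scheme of section 2.1. This is a speculative Python sketch of the proposed equations 6.2.1-6.2.2, not something tested in the project:

```python
import numpy as np

def augmented_step(x, xi, mu, theta, sigma, dt, rng):
    """One Euler-Maruyama step of the proposed augmented system
    dX = theta*(mu - X) dt + xi dt,  d(xi) = sigma dW,
    so the noise enters X additively through the extra state xi."""
    x_new = x + theta * (mu - x) * dt + xi * dt
    xi_new = xi + sigma * np.sqrt(dt) * rng.standard_normal()
    return x_new, xi_new
```

With xi carried in the state vector alongside μ, σ and θ, the filter sees an additive forcing rather than a parameter multiplying the noise directly.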

    Improved ECG Filtering Due to time constraints we were unable to create a robust method to remove the heartbeat from the ECG data; we have seen that this is problematic when we pass the data through our filter. The result is that the heartbeat has a large impact on μ, which in turn affects the ability of the filter to estimate all the parameters correctly. Creating this tool would require a much more complex system that would need to detect when a heartbeat is about to occur and stop recording the data for a given time.

    Benchmark ECG Data The aim of this filter is to be able to classify the ECG; however, it is necessary to benchmark the filter against a series of healthy patients prior to testing it on data from patients who suffer from atrial fibrillation. In doing so we will be able to create a classification system that would allow the surgeon to understand the impact of the ablative surgery on the patient.


    Bibliography

    American Heart Association (2008), Atrial fibrillation (for professionals), http://www.americanheart.org/presenter.jhtml?identifier=1596. (Retrieved May 2010).

    Anderson, J. L. (2007), An adaptive covariance inflation error correction algorithm for ensemble filters, Tellus A 59(2), 210–224.

    Calkins, H., Brugada, J., Packer, D. L., Cappato, R., Chen, S.-A. A., Crijns, H. J., Damiano, R. J., Davies, D. W., Haines, D. E., Haissaguerre, M., Iesaka, Y., Jackman, W., Jais, P., Kottkamp, H., Kuck, K. H. H., Lindsay, B. D., Marchlinski, F. E., McCarthy, P. M., Mont, J. L., Morady, F., Nademanee, K., Natale, A., Pappone, C., Prystowsky, E., Raviele, A., Ruskin, J. N., Shemin, R. J., Heart Rhythm Society, European Heart Rhythm Association, European Cardiac Arrhythmia Society, American College of Cardiology, American Heart Association and Society of Thoracic Surgeons (2007), HRS/EHRA/ECAS expert consensus statement on catheter and surgical ablation of atrial fibrillation: recommendations for personnel, policy, procedures and follow-up. A report of the Heart Rhythm Society (HRS) Task Force on Catheter and Surgical Ablation of Atrial Fibrillation, developed in partnership with the European Heart Rhythm Association (EHRA) and the European Cardiac Arrhythmia Society (ECAS); in collaboration with the American College of Cardiology (ACC), American Heart Association (AHA), and the Society of Thoracic Surgeons (STS). Endorsed and approved by the governing bodies of the American College of Cardiology, the American Heart Association, the European Cardiac Arrhythmia Society, the European Heart Rhythm Association, the Society of Thoracic Surgeons, and the Heart Rhythm Society, Europace 9(6), 335–379.
    URL: http://dx.doi.org/10.1093/europace/eum120

    DelSole, T. and Yang, X. (2009), A Bayesian method for estimating stochastic parameters. Submitted to Physica D.

    Evensen, G. (1992), Using the extended Kalman filter with a multilayer quasi-geostrophic ocean model, J. Geophys. Res. 97, 17905–17924.

    Evensen, G. (2006), Data Assimilation: The Ensemble Kalman Filter, Springer-Verlag New York, Inc.,Secaucus, NJ, USA.

    Evensen, G. (2009), The ensemble Kalman filter for combined state and parameter estimation, Control Systems Magazine, IEEE 29(3), 82–104.

    Gartner, G. E. A., Hicks, J. W., Manzani, P. R., Andrade, D. V., Abe, A. S., Wang, T., Secor, S. M. and Garland Jr., T. (2010), Phylogeny, ecology, and heart position in snakes, Physiological and Biochemical Zoology 83(1), 43–54.
    URL: http://www.journals.uchicago.edu/doi/abs/10.1086/648509

    Iqbal, M. B., Taneja, A. K., Lip, G. Y. H. and Flather, M. (2005), Recent developments in atrial fibrillation, British Medical Journal 330(7485), 238–243.

    Kalman, R. E. (1960), A new approach to linear filtering and prediction problems, Transactions of the ASME—Journal of Basic Engineering 82(Series D), 35–45.

    Kloeden, P. E., Platen, E. and Schurz, H. (2003), Numerical solution of SDE through computer experiments, Universitext, corr. 3rd print edn, Springer, Berlin.

    Krul, A. (2008), Calibration of stochastic convenience yield models for crude oil using the Kalman filter, Master's thesis, Delft University of Technology.


    Majda, A. J., Harlim, J. and Gershgorin, B. (2010), Mathematical strategies for filtering turbulent dynamical systems, Discrete and Continuous Dynamical Systems 27(2), 441–486.

    McGee, L. A. and Schmidt, S. F. (1985), Discovery of the Kalman filter as a practical tool for aerospace and industry, Technical Memorandum 86847, NASA.

    Risken, H. (1996), The Fokker-Planck Equation, 2nd edn, Springer.

    Simon, D. (2006), Optimal State Estimation: Kalman, H Infinity, and Nonlinear Approaches, Wiley-Interscience.

    Sivakumaren, S. (2009), The detection of heart arrhythmia in electrograms, Master's thesis, Imperial College London.

    Stewart, S., Murphy, N., Walker, A., McGuire, A. and McMurray, J. J. V. (2004), Cost of an emerging epidemic: an economic analysis of atrial fibrillation in the UK, Heart 90(3), 286–292.

    Uhlenbeck, G. E. and Ornstein, L. S. (1940), On the theory of Brownian motion, Phys. Rev. 36(5), 823–841.

    van den Berg, T. (2007), Calibrating the Ornstein-Uhlenbeck model, http://www.sitmo.com/doc/Calibrating_the_Ornstein-Uhlenbeck_model. (Retrieved May 2010).

    Welch, G. and Bishop, G. (2006), An Introduction to the Kalman Filter, Department of Computer Science,University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-3175.

    Whitaker, J. S. and Hamill, T. M. (2002), Ensemble data assimilation without perturbed observations, Mon. Wea. Rev. 130, 1913–1924.

    Yang, X. and DelSole, T. (2009), Using the ensemble Kalman filter to estimate multiplicative model parameters, Tellus A 61(5), 601–609.


    Appendix

    A Solution to OU Process

    Let
        Y_t = X_t − μ                                                    (A.1)
    Thus
        dY_t = dX_t = −θY_t dt + σ dW_t                                  (A.2)
    Now applying a variation of parameters such that
        Z_t = Y_t e^{θt}                                                 (A.3)

        dZ_t = θY_t e^{θt} dt + e^{θt} dY_t                              (A.4)
    This removes the drift term (−θY_t dt) from the equation:
        dZ_t = θY_t e^{θt} dt + e^{θt}(−θY_t dt + σ dW_t)
             = 0 dt + σe^{θt} dW_t                                       (A.5)
    Thus a solution to this problem can be found by integrating between s and t to obtain the following:
        Z_t = Z_s + ∫_s^t σe^{θu} dW_u                                   (A.6)
    By reverting back to Y_t the expression becomes:
        Y_t = e^{−θt}Z_t
            = e^{−θ(t−s)}Y_s + σe^{−θt} ∫_s^t e^{θu} dW_u                (A.7)
    and finally the solution using X_t becomes:
        X_t = μ + e^{−θ(t−s)}(X_s − μ) + σe^{−θt} ∫_s^t e^{θu} dW_u      (A.8)
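Taking moments of this solution gives the transition mean and variance used by the exact AR(1) update in appendix D; a short derivation via the Itô isometry (standard, and consistent with the generator code, though not spelled out in the report):

```latex
% Expectation: the stochastic integral in (A.8) has zero mean, hence
\mathbb{E}[X_t \mid X_s] = \mu + e^{-\theta (t-s)} (X_s - \mu)

% Variance: by the Ito isometry applied to the stochastic integral in (A.8)
\operatorname{Var}[X_t \mid X_s]
    = \sigma^2 e^{-2\theta t} \int_s^t e^{2\theta u} \, du
    = \frac{\sigma^2}{2\theta} \left( 1 - e^{-2\theta (t-s)} \right)
```

With t − s = Δt these are exactly the decay factor and noise variance that appear in the OU generator of appendix D.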


    B Estimation Issue

    (DelSole and Yang, 2009)
    For simplicity we will reduce the stochastic process to the following, where the autoregressive coefficient λ is assumed to be a known constant:

        x_t = λx_{t−1} + θ dW_t                                          (B.1)

    Hence our AR(1) model becomes

        x^f_t = λx^a_{t−1} + θ^a_{t−1} dW_t                              (B.2)

    and our persistence model remains as

        θ^f_t = θ^a_{t−1}                                                (B.3)

    If we then use the Kalman filter equations as described in section 3.1.1, we obtain the required parameters in the filter update: the Kalman gain matrix and the error covariance matrices. In a simple 2D state space where x_t = [x_t θ_t]^T, our Kalman gain matrix becomes:

        K = [ var[x^f_t], cov[x^f_t, θ^f_t] ]^T (var[x^f_t] + R)^{−1}    (B.4)

    Thus our update equation for θ becomes

        θ^a_t = θ^f_t + cov[x^f_t, θ^f_t] (var[x^f_t] + R)^{−1} (z_t − x^f_t)   (B.5)

    This implies that the update of θ_t is proportional to the covariance between x_t and θ_t. This can be computed by revisiting equation B.2:

        cov[x^f_t, θ^f_t] = cov[λx^a_{t−1}, θ^f_t] + cov[θ^f_t dW_t, θ^f_t]     (B.6)

    The last term in the above equation can be computed:

        cov[θ^f_t dW_t, θ^f_t] = E[θ^f_t dW_t θ^f_t] − E[θ^f_t dW_t] E[θ^f_t]
                               = E[(θ^f_t)^2] E[dW_t] − E[θ^f_t] E[dW_t] E[θ^f_t]
                               = 0                                       (B.7)

    This is because dW_t is white noise and thus independent of θ^f_t. If we apply the persistence equation to the first term we obtain

        cov[λx^a_{t−1}, θ^f_t] = λ cov[x^a_{t−1}, θ^a_{t−1}]             (B.8)

    Thus equation B.6 becomes

        cov[x^f_t, θ^f_t] = λ cov[x^a_{t−1}, θ^a_{t−1}]                  (B.9)

    Applying the update equation for the covariance matrix, it is possible to obtain the following expression:

        cov[x^a_t, θ^a_t] = cov[x^f_t, θ^f_t] R (R + var[x^f_t])^{−1}    (B.10)

    Hence

        cov[x^f_t, θ^f_t] = λ R (R + var[x^f_{t−1}])^{−1} cov[x^f_{t−1}, θ^f_{t−1}]   (B.11)

    We know that

        R (R + var[x^f_t])^{−1} ≤ 1                                      (B.12)

    and thus we have the following bound:

        |cov[x_t, θ_t]| ≤ |λ|^t |cov[x_0, θ_0]|                          (B.13)

    For stability we require |λ| < 1, thus cov[x_t, θ_t] → 0 as t → ∞. The implication is that as cov[x^f_t, θ^f_t] tends to 0, θ_t ceases to be updated; θ_t then remains constant and equal to the initial guess θ_0.
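This behaviour can be checked numerically with a toy two-state ensemble Kalman filter. The following is an illustrative Python sketch (not the project's MATLAB filter), with λ, the true parameter, R, and the ensemble size chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, theta_true, R = 0.9, 1.0, 0.5       # known AR coefficient, true noise parameter
N = 500                                  # ensemble size

x = rng.standard_normal(N)               # state ensemble
th = 0.5 + 0.1 * rng.standard_normal(N)  # parameter ensemble, centred on a wrong guess
x_truth = 0.0

for _ in range(200):
    # truth and observation
    x_truth = lam * x_truth + theta_true * rng.standard_normal()
    z = x_truth + np.sqrt(R) * rng.standard_normal()
    # forecast: parameter persists; each member uses its own parameter
    x = lam * x + th * rng.standard_normal(N)
    # EnKF update with perturbed observations
    C = np.cov(x, th)                    # 2x2 sample covariance of [x, theta]
    gain_x = C[0, 0] / (C[0, 0] + R)
    gain_th = C[0, 1] / (C[0, 0] + R)    # proportional to cov[x, theta], as in (B.5)
    innov = (z + np.sqrt(R) * rng.standard_normal(N)) - x
    x = x + gain_x * innov
    th = th + gain_th * innov
```

Because the state-parameter cross-covariance never rises above sampling-noise level, the parameter ensemble mean stays near its initial guess of 0.5 instead of migrating towards the true value of 1.0, which is the failure mode derived in this appendix.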


    C Maximum Likelihood Estimator

    (van den Berg, 2007)

    function [mu,sigma,theta] = maxlikely(S,delta)
    %MAXLIKELY Closed-form maximum likelihood estimates for an OU process
    %sampled at interval delta (van den Berg, 2007).
    n   = length(S) - 1;
    Sx  = sum( S(1:end-1) );
    Sy  = sum( S(2:end) );
    Sxx = sum( S(1:end-1).^2 );
    Sxy = sum( S(1:end-1).*S(2:end) );
    Syy = sum( S(2:end).^2 );

    mu    = (Sy*Sxx - Sx*Sxy) / ( n*(Sxx - Sxy) - (Sx^2 - Sx*Sy) );
    theta = -log( (Sxy - mu*Sx - mu*Sy + n*mu^2) / (Sxx - 2*mu*Sx + n*mu^2) ) / delta;
    a = exp(-theta*delta);
    %variance of the one-step AR(1) residual
    sigmah2 = (Syy - 2*a*Sxy + a^2*Sxx - 2*mu*(1-a)*(Sy - a*Sx) + n*mu^2*(1-a)^2) / n;
    %convert the residual variance to the diffusion parameter
    sigma = sqrt(sigmah2*2*theta/(1 - a^2));
    end
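A quick way to sanity-check this estimator is to port the same closed-form formulas and run them on a simulated OU path; a Python sketch (the simulation mirrors the exact AR(1) update of appendix D, and the parameter values are arbitrary):

```python
import numpy as np

def maxlikely(S, delta):
    """Closed-form ML estimates (mu, sigma, theta) for an OU process
    sampled at interval delta, mirroring the MATLAB routine above."""
    S = np.asarray(S, dtype=float)
    n = len(S) - 1
    Sx = S[:-1].sum(); Sy = S[1:].sum()
    Sxx = (S[:-1] ** 2).sum(); Syy = (S[1:] ** 2).sum()
    Sxy = (S[:-1] * S[1:]).sum()
    mu = (Sy * Sxx - Sx * Sxy) / (n * (Sxx - Sxy) - (Sx**2 - Sx * Sy))
    theta = -np.log((Sxy - mu * Sx - mu * Sy + n * mu**2)
                    / (Sxx - 2 * mu * Sx + n * mu**2)) / delta
    a = np.exp(-theta * delta)
    sh2 = (Syy - 2 * a * Sxy + a**2 * Sxx
           - 2 * mu * (1 - a) * (Sy - a * Sx) + n * mu**2 * (1 - a)**2) / n
    sigma = np.sqrt(sh2 * 2 * theta / (1 - a**2))
    return mu, sigma, theta

# Simulate an OU path with known parameters and re-estimate them
rng = np.random.default_rng(0)
dt, n, mu0, sig0, th0 = 1e-3, 200_000, 10.0, 50.0, 5.0
a = np.exp(-th0 * dt)
s = sig0 * np.sqrt((1 - a**2) / (2 * th0))
x = np.empty(n); x[0] = mu0
for i in range(1, n):
    x[i] = a * x[i - 1] + mu0 * (1 - a) + s * rng.standard_normal()
mu_e, sig_e, th_e = maxlikely(x, dt)
```

On synthetic data of this length the estimates land close to the true parameters; θ carries the largest relative error, consistent with its slower convergence noted in section 6.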


    D Ornstein-Uhlenbeck Process MATLAB Code

    function [Yt] = OU(T,N,mu,sigma,theta,Y0)
    %OU generates an OU process using the analytical solution
    dt = T/N;

    %generate vector distributed N(0,1)
    rnd = randn(1,N);

    Yt = zeros(N,1);
    Yt(1) = Y0;

    for i = 2:N
        %AR(1) analytical solution
        Yt(i) = Yt(i-1)*exp(-theta*dt) + mu*(1-exp(-theta*dt)) + ...
            sigma*sqrt((1-exp(-2*theta*dt))/(2*theta))*rnd(i);
    end
    end


    E Final Filter MATLAB code

    %% Initialization of filter and OU Process
    close all
    clear
    clc
    tic

    %Number of dimensions
    %NOTE: This should be the number of dimensions minus one as the diffusion
    %parameter in the stochastic equation is not included in the state space
    %vector
    D = 3;

    %Generate OU Process
    mu = 10.0;
    sigma = 1000.0;
    theta = 650.0;
    Mo = 1e6;
    T = 100;
    dt = T/Mo;

    %Number of ensemble members
    N = 250;

    %Call OU Process
    Yt = OU(T,Mo,mu,sigma,theta,mu);

    %Simulate measurement noise
    R = 0.01^2;

    %Artificial inflation parameters
    lambdam = 1.02;
    lambdat = 1.008;
    Cmu = 0.0;
    Cthe = 0.0;

    %Temporal smoothing parame