receding horizon control of hiv

19

Click here to load reader

Upload: john-david

Post on 15-Jun-2016

214 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: Receding Horizon Control of HIV

OPTIMAL CONTROL APPLICATIONS AND METHODSOptim. Control Appl. Meth. 2011; 32:681–699Published online 1 October 2010 in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/oca.969

Receding Horizon Control of HIV

John David1,2, Hien Tran2,∗,† and H. T. Banks2

1Department of Mathematics and Computer Science, The College of Wooster, Wooster, OH 44691, U.S.A.2Department of Mathematics and Center for Research in Scientific Computation, Box 8205,

North Carolina State University, Raleigh, NC 27695-8205, U.S.A.

SUMMARY

This paper describes a model of the immunologic response of the human immunodeficiency virus (HIV)in individuals. It then illustrates how a Receding Horizon Control (RHC) methodology can be used todrive the system to a stable equilibrium in which a strong immune response controls the viral load in theabsence of drug treatment. We also illustrate how this feedback methodology can overcome unplannedtreatment interruptions, inaccurate or incomplete data and imperfect model specification. We considerhow ideas from stochastic estimation can be used in conjunction with RHC to create a robust treatmentmethodology. We then consider the performance of this methodology over random simulations of thepreviously considered clinical conditions. Copyright � 2010 John Wiley & Sons, Ltd.

Received 22 December 2009; Revised 11 August 2010; Accepted 12 August 2010

KEY WORDS: HIV model; Receding Horizon Control; extended Kalman filter; state estimator

1. INTRODUCTION

There have been various studies illustrating the application of optimal control methodology to HIVdrug treatment. A large number of methodologies for the control of HIV are based on open-looptechniques. This involves pre-computing a control over the entire treatment time interval for agiven model and initial conditions, without later revising the treatment based on newly availableinformation. An example of these works can be seen in [1–6]. This technique may be inadequatefor a number of reasons. First, some unmodeled effect can disturb the system, thus rendering thetreatment schedule ineffective or, even worse, detrimental. Second, if a patient misses a dose, theopen-loop control is no longer optimal or even necessarily beneficial. This methodology also doesnot take advantage of the measurements taken from an individual. For these reasons, we considerin this paper feedback methodology, where the control depends on the current state of the system,in the context of the control of an HIV infection.

Some feedback techniques may be inappropriate or non-applicable to the problem of HIVcontrol. Many feedback techniques are based on linear plants or models. It is difficult to accuratelymodel HIV dynamics with a linear model, especially as the time scale expands and the number ofmodeled biological mechanisms increases. See [7] for an excellent review of how the complexityof these models has increased as understanding of disease dynamics has progressed. Techniquesbased around solutions of Ricatti equations may be augmented to solve nonlinear control problems;however, these techniques require that the model be of a certain form and can require frequentstate knowledge or estimation. For an example of this sort of technique applied to HIV control,

∗Correspondence to: Hien Tran, Department of Mathematics and Center for Research in Scientific Computation, Box8205, North Carolina State University, Raleigh, NC 27695-8205, U.S.A.

†E-mail: [email protected], [email protected]

Copyright � 2010 John Wiley & Sons, Ltd.

Page 2: Receding Horizon Control of HIV

682 J. DAVID, H. TRAN AND H. T. BANKS

see [8]. A potential solution involves using Bellman’s principle of optimality; however, this involvessolving a nonlinear partial differential equation which, in general, is computationally challenging.

RHC seeks to gain the benefits of a feedback control while utilizing the computational simplicityof a calculus of variations approach. RHC solves a finite horizon open-loop optimal control problemonline at each sampling instant. There are several design considerations in using this methodologyincluding the process model, the cost function to be minimized, the sampling period, the controlhorizon, and the method by which the state is obtained at each sampling instant. The method bywhich the state is obtained at each sampling instant is of special concern as it generally involvessome combination of model prediction and noisy measurements. For this problem, we will employstochastic estimation; in particular, the extended Kalman filter (EKF). An excellent survey paperrelated to the theory behind RHC is [9]. Another source for the multitude of industrial applicationsof this technique is [10]. This technique has been used several times in the context of HIVcontrol [11–14].

The work in [12–14] uses model predictive control (MPC) on the model described in [15].It includes compartments for healthy T-cells, infected T-cells, helper-independent and dependentCTLs and CTL precursors. It lumps the effects of reverse transcriptase inhibitors (RTIs) andprotease inhibitors (PIs). It also uses a normalize model, i.e. ‘the values of the states have notbeen adjusted to correspond to any actual data’ [13]. The basic control methodology in this workis similar to the methodology used here. This work extends upon previous work in applyingcontrol methodology to the problem of HIV treatment in several ways. First, this work considersa complex model of HIV dynamics that includes multi-drug therapy, multiple target cells and acompartment accounting for the virus-specific immune response. Not only do we consider complexmodel dynamics, but this model has also been validated against data including CD4 counts andviral loads of over a hundred individuals between the period of 1996 and 2004 as described in [16].The use of a validated model also increases the stiffness and ill-conditioning of the model creatingsignificantly more numerical challenges over a normalize model as seen in [17, 18]. The use ofa stochastic estimator, the EKF, as opposed to the deterministic estimator described in [8] allowsus to specifically account for both the uncertainty in the model and the data. Beyond consideringthe performance of this methodology in the context of noisy measurements, and inaccurate modelparameters, it also considers the performance of this control method when patients do not adhereto the drug schedule and when the initial condition of the patient is unknown and the estimatormust converge to the patient’s true condition.

The organization of this work is as follows. Section 2 discusses the model of HIV infectionstudied here. We then describe the specific RHC methodology that we have considered in Section 3.Numerical results for the RHC implementation are presented in Section 4. Situations in whichtreatments are missed and a state estimator is employed are presented in Sections 5 and 6. Thenumerical solutions to these two combined problems are illustrated in Section 7. Section 8 showshow RHC methodology can be used in situations in which model parameters are allowed to varyin some random manner.

2. HIV MODEL

A great deal of effort has gone into modeling the physiologic and immunologic response of HIVin individuals. For excellent reviews of the various types of modeling attempts, see e.g. [7, 19, 20].In our attempt to model the physiologic and immunologic response of the HIV in individuals,we will consider a variation of the model proposed in [19]. It accounts for a variation of drugefficacy based on target cell type. Note that the differentiation between target cells is based oncell type as opposed to cell location or longevity. A second chief component of this model isthe modeling of the bodies HIV-specific immune response. This aspect of the model is based ona Michaelis–Menten nonlinearity saturation as proposed by Bonhoeffer et al. [21]. This specificmodel has been studied in [3, 8, 22].

This model captures some of the chief characteristics of HIV infection that we would like toconsider. In addition to modeling the bodies, HIV-specific immune response and differentiating

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 3: Receding Horizon Control of HIV

RHC OF HIV 683

Table I. Values of parameters in the HIV model.

Parameter Value Unit Parameter Value Unit

�1 10.0 cells/mm3 day �2 31.98e−3 cells/mm3 dayd1 0.01 1/day d2 0.01 1/dayk1 8e−4 mm3/virionsday k2 0.1 mm3/virionsdaym1 0.01 mm3/cellsday m2 0.01 mm3/cellsday�1 1 virions/cells �2 1 virions/cells� 0.7 1/day c 13 1/dayf 0.34 — NT 100 virions/cells�E 1e−3 cells/mm3 day �E 0.1 1/daybE 0.3 1/day dE 0.25 1/dayKb 0.1 cells/mm3 Kd 0.5 cells/mm3

between types of target cells this model also accounts for multi-drug therapy, both RTIs andPIs. RTIs interfere with the viral RNA-to-DNA synthesis, thus blocking the introduction of viralDNA into the cells’ genetic information. Without this step, the cells will not produce new virus.PIs attack the virus at the stage at which infected cells assemble new viral cells, thus producingnon-infectious viral cells. An important physiological aspect of this model is its ability to repro-duce a low non-zero viral load in the presence of multi-drug therapy. This model also possessesmultiple stable equilibria including a viral dominant equilibrium and an immune response dominantequilibrium.

The dynamics of our HIV model are described by the system of differential equations

T1 = �1 −d1T1 −(1−�1)k1VI T1,

T2 = �2 −d2T2 −(1− f �1)k2VI T2,

T ∗1 = (1−�1)k1VI T1 −�T ∗

1 −m1 ET ∗1 ,

T ∗2 = (1− f �1)k2VI T2 −�T ∗

2 −m2 ET ∗2 ,

VI = (1−�2)NT �(T ∗1 +T ∗

2 )−[c+(1−�1)�1k1T1 +(1− f �1)�2k2T2]VI,

VNI = �2 NT �(T ∗1 +T ∗

2 )−cVN I,

E = �E +bET ∗

1 +T ∗2

T ∗1 +T ∗

2 +KbE −dE

T ∗1 +T ∗

2

T ∗1 +T ∗

2 +KdE −�E E .

(1)

The state variables are T1, the uninfected CD4+ T-cells; T2, the uninfected target cells of asecond kind; T ∗

1 , the infected T-cells; T ∗2 , the infected target cells of a second kind; VI, the

infectious virus; VNI, the non-infectious virus; and E , the immune effectors. We do not givespecific definitions of target cells of second kind and immune effectors. They could, for example,be related to macrophages and cytotoxic T-lymphocytes, respectively. Table I contains values formodel parameters. The predictive capabilities of this model were demonstrated in [16] with clinicaldata from over 100 individuals from the period between 1996 and 2004. See [3, 23] for a furtherdescription of the model and model parameters.

3. RHC METHODOLOGY

As stated before, RHC seeks to gain the benefits of a feedback control while utilizing the compu-tational simplicity of a calculus of variations approach. Thus, there are several basic componentsto the methodology. First, there are the model equations that we use to predict system behaviorand compute the optimal control. There is the estimation of the current state, which can be donewith a combination of measured data and predicted output. We will use the methodology describedin [17]; however, there are a variety of techniques to complete this step. There is also the specific

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 4: Receding Horizon Control of HIV

684 J. DAVID, H. TRAN AND H. T. BANKS

Figure 1. Schematic diagram of RHC methodology.

optimal control problem, or cost function one will seek to minimize. Finally, there is the method-ology by which the control horizon, the period over which each optimal control problem is solved,and the period over which this control is used are selected. Figure 1 is a schematic that illustratesthis general process.

To formalize this methodology, we will present the following mathematical formulation. We willuse notation for the RHC methodology that is similar to that presented in [24]. Let [ti ti+1] be asequence of time intervals. Let tch,i , such that tch,i�ti+1 − ti , be the control horizon. Consider thesequence of problems, Oi ,

minu

J (u)=∫ ti +tch,i

tiL(x,u, t)dt +�(x(ti + tch,i )), (2)

subject to

x = f (x,u), x(ti )= xi , (3)

on the interval [ti ti+1]. In order to be biologically realistic, we will also enforce bounds on thecontrol, umin�u(t)�umax. Note that for this problem the control is the efficacy of the drug doseand thus we will write u = [�1 �2].

The solution to each optimal control problem Oi can be analyzed and solved in the mannerpresented in [17]. Note that we used a quasi-Newton method BFGS. The code used in this problemcan be obtained from the web site http://www.math.ncsu.edu/~ctk. For more informationon this optimization algorithm, see [25].

The RHC methodology can be summarized as follows:

1. Given initial condition x(ti ), solve the control problem Oi on the time interval [ti ti + tch,i ].2. Use the control as it is defined on the interval [ti ti+1] to determine the trajectory on the

same time interval.3. Use observations or an estimator to determine x(ti+1).4. Repeat the process over starting at the next time interval ti+1.

This process is illustrated in Figure 2. This means that O0 was solved by minimizing the integralfrom t0 to T . However, the control was only used from t0 to t1. At this point, a new control wascomputed by solving O1 on the interval t1 to t1 +T . This control was then used from t1 to t2. Notethat neither the time intervals nor the control horizon need to be constant.

Note that our numerical routine to solve each control problem Oi is an iterative method, thusrequiring an initial iterate. The initial iterate for O0 was u0 =0.5[�1,max �2,max]. The initial iteratefor each subsequent Oi was the optimal solution to Oi−1. Two heuristic approaches were usedto facilitate numerical computations. One approach was how to choose the initial iterate when

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 5: Receding Horizon Control of HIV

RHC OF HIV 685

–5

–10

–15

–20

0

5

10Example of Receding Horizon Control

t0 t3t2t1 T

Control computed on [t0 T]

Control used on [t0 t

1]

Control computed on [t1 t

1+T]

Control used on [t1+t

2]

Figure 2. Example of RHC.

the immune response E was stimulated above 200cells/mm3. In this case, the initial iterate wasdecreased by a factor of 10 as this facilitated convergence to a low dose control that stimulatedfurther immune response. The other was in the case that the ode solver did not converge for acandidate control vector. In this case, u p was replaced with 0.99u p and the algorithm proceeded.The optimal control estimation is discussed further in [17] and the full optimality conditions aregiven in [18].

4. NUMERICAL RESULTS: PERFECT STATE KNOWLEDGE

For our study, we employed the following cost functional:

J (�1,�2) =∫ ti +tch,i

ti[QeV (t)+ R1�

21(t)+ R2�

22(t)−SE(t)]dt +�(E(ti + tch,i )−353.108)2, (4)

where Qe, R1, R2, S and � are weighting coefficients. The maximum values for the drug efficaciousfor the following simulations are �1,max =0.7 and �2,max =0.3. The results of [18] show that eachoptimal control problem Oi has a unique solution with the addition of one condition, that �, theweighting on the final state, is continuous. As � is a quadratic function of E , this condition holds.

Table II contains the values for the optimal control problem. The small values of weightingcoefficients are used so that the gradient values are of approximately the same magnitude as thecontrol bounds, thus improving the efficiency of the optimization.

In this section, we will assume that we have perfect knowledge of the state at each point inwhich a new control is computed. This is not physically realistic, but we will use the EKF toaddress this problem in Section 6. The initial condition for this problem is a healthy individualwith a small amount of free virus introduced to their system. This can be termed as acute infection.The specific values for this initial condition are

T1 = 1000cells/mm3, T2 =3.198cell/mm3,

T ∗1 = 10−5 cells/mm3, T ∗

2 =10−5 cell/mm3,

VI = 0.001virions/mm3, VNI =10−5 virions/mm3,

E = 0.01cells/mm3.

(5)

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 6: Receding Horizon Control of HIV

686 J. DAVID, H. TRAN AND H. T. BANKS

Table II. Parameters for RHC problem.

Parameter Value

ti 50i days

tch,i 50 days

Qe 10−9

R1 10−9

R2 10−9

S 10−5

� 5−4

0 500 10000

500

1000

T

0 500 10000

2

4T

0 500 10000

200

400

T

0 500 10000

1

2

3

T

0 500 10000

500

1000

1500V

0 500 10000

200

400

time (days)

E

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 3. Model response and control when the control is allowed to vary daily.

Our goal is to transfer this infected individual to a healthy immune response dominated equilibrium.The specific values for this state are

T1 = 967.839cells/mm3, T2 =0.621cell/mm3,

T ∗1 = 0.076cells/mm3, T ∗

2 =0.006cell/mm3,

VI = 0.415virions/mm3, VNI =0virions/mm3,

E = 353.108cells/mm3.

(6)

This equilibrium has been studied in [3, 23]. We will merely note here that this is a stable equilibriumin the presence of no treatment. We have studied three cases for how often the control is allowedto change. The treatments are allowed to vary daily, every 5 days, and every 10 days. Note that asthe model and control are continuous function (as opposed to a discrete model), the control valuesused in-between specified treatment values are a linear interpolation.

Figures 3–6 show control and model response for each of these cases. The non-infectious virusVNI has been omitted in the figures as that compartment has no direct effect on any of the othercompartments. Figure 3 shows each one of the state variables as a function of time in days onthe left and the corresponding treatment protocol on the right. Note how the healthy cells declineduring the treatment interruptions and the free virus and infected cells rebound. The state variableE is the best indicator of the progression toward immune system dominance and can be seen in the

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 7: Receding Horizon Control of HIV

RHC OF HIV 687

3phase diagram phase diagram

2.5

2

1.5

1

0.5

–0.5

–2

–1.5

–1

–4 –3 –2 –1

log V

log

E

0 1 2 3 4

0

3

2.5

2

1.5

1

0.5

–0.5

–2

–1.5

–1

–4 –3 –2 –1

log V

log

E

0 1 2 3 4

0

Figure 4. Phase plot of log of immune effectors vs log of virus when control is allowed to vary daily onthe left. The results for when control was allowed to vary every 10 days are on the right.

0 500 10000

500

1000

T

0 500 10000

2

4T

0 500 10000

200

400

T

0 500 10000

1

2

T

0 500 10000

500

1000

1500V

0 500 10000

200

400

time (days)

E

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 5. Model response and control when control is allowed to vary every 5 days.

0 500 10000

500

1000T

0 500 10000

2

4T

0 500 10000

200

400

T

0 500 10000

1

2

T

0 500 10000

500

1000

1500V

0 500 10000

200

400

time (days)

E

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 6. Model response and control when control is allowed to vary every 10 days.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 8: Receding Horizon Control of HIV

688 J. DAVID, H. TRAN AND H. T. BANKS

0 500 10000

500

1000

T

0 500 10000

0.5

1T

0 500 10000

100

200

T

0 500 10000

0.1

0.2

T

0 500 10000

500

1000V

0 500 10000

200

400

time (days)

E

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 7. Model response and control when control is allowed to vary daily. Initialconditions are given by unhealthy steady state.

lower right box of all the state variables. Figure 4 shows the phase plot of the log of the immuneeffector as a function of the log of the virus. Note that in this sort of plot, time is an implicitvariable with only the two state variables being shown. This plot begins at acute infection and thusrelatively low levels of E and V . We can see how the viral load experiences one increase duringinfection and four increases during treatment interruptions. It is interesting to see that E decreasesduring most of the treatment interruption only to increase toward the end of the interruption.Note that each one of these treatment protocols has treatment interruptions under which the viralload and infected cells rebound, whereas the number of healthy cells drops. Also note that howoften the control is allowed to vary has a large effect on how long it takes the immune responseto be stimulated and for the individual to be transferred to the healthy steady state. When thecontrol is allowed to vary daily, the control is largely turned off after 400 days, whereas it isnot turned off until approximately 600 days when the control varies every 5 days and 900 dayswhen the control is allowed to vary every 10 days. The phase plots in Figure 4 also illustrate thedifference in system response when the control is only allowed to vary every 10 days. Note howmore interruptions are required and there is a good deal more time where the immune response isdeclining.

We would also like to determine whether this methodology can be used to transfer a patientfrom the unhealthy equilibrium to the healthy equilibrium. By unhealthy equilibrium, we meanthe state

T1 = 163.573cells/mm3, T2 =0.005cell/mm3,

T ∗1 = 11.945cells/mm3, T ∗

2 =0.046cell/mm3,

VI = 63.919virions/mm3, VNI =0virions/mm3,

E = 0.024cells/mm3.

(7)

This equilibrium represents a physical state in which the immune response has largely beendestroyed and a large quantity of infected cells and free virus is present in an individual. Note thatthis equilibrium is stable. Figures 7, 9 and 10 illustrate the model simulations and controls whenthe initial conditions are given by the unhealthy steady state. Figure 8 shows the phase plot of thelog of the immune effector as a function of the log of the virus. In each one of these situations,even when the control was only allowed to vary every 10 days, the RHC methodology was ableto transfer the system to the immune effector-dominated equilibrium.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 9: Receding Horizon Control of HIV

RHC OF HIV 689

3phase diagram

2.5

2

1.5

1

0.5

–0.5

–1

–1.5

–2–1 –0.5 0 0.5

log10 V

log 1

0 E

1 1.5 2 2.5 3

0

Figure 8. Phase plot of log of immune effectors vs log of virus when control is allowed to vary daily andthe initial condition is the unhealthy equilibrium.

0 500 10000

500

1000

0 500 10000

0.5

1

0 500 10000

100

200

0 500 10000

0.1

0.2

0 500 10000

500

1000

0 500 10000

200

400

time (days)

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 9. Model response and control when control is allowed to vary every 5 days. Initial conditions aregiven by unhealthy steady state.

0 500 10000

500

1000

0 500 10000

0.5

1

0 500 10000

50

100

150

0 500 10000

0.1

0.2

0 500 10000

500

1000

0 500 10000

200

400

time (days)

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 10. Model response and control when control is allowed to vary every 10 days. Initial conditionsare given by unhealthy steady state.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 10: Receding Horizon Control of HIV

690 J. DAVID, H. TRAN AND H. T. BANKS

5. UNSCHEDULED TREATMENT INTERRUPTION

A natural question to pose is what if a patient goes off medication? This can occur because of newinfections, drug side effects or a number of other reasons. Open-loop methodology has no way totake this into account and will most likely prove insufficient. Feedback controllers can take thistreatment lapse into account and adjust the controller. We will examine how RHC performs whena patient disregards the proposed treatment schedule and goes off treatment. For the followingsimulations, a patient went off treatment for 10 days after 300 days of RHC-based treatment. Otherthan this stopping of treatment, the experimental design is identical to that presented in Section 4.Figure 11 contains the results from this simulation. Note that this situation requires six periodsof treatment interruptions, whereas the identical situation in which the treatment regiment wasfollowed required only four treatment interruptions.

As the immune response is key to the healthy steady state, we should examine the effect thistreatment interruption has on the immune response. Figure 12 has a comparison of the immuneresponse when the control is allowed to vary daily, but in one case the patient adhered to thetreatment protocol whereas in the other case the patient went off treatment. First, it delays the

0 500 10000

500

1000

0 500 10000

2

4

0 500 10000

200

400

0 500 10000

1

2

3

0 500 10000

500

1000

1500

0 500 10000

200

400

time (days)

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

Figure 11. Model response and control when control is allowed to vary daily and controlwas interrupted after 300 days for 10 days.

0 200 400 600 800 1000

70

60

50

40

30

20

10

0

-10300 350 400 450 500

0

50

100

150

200

250

300

350

400

time (days) time (days)

no treatment interruptionwith treatment interruption

no treatment interruptionwith treatment interruption

Figure 12. Immune response when control is allowed to vary daily. Comparison betweenfollowing treatment and when treatment was interrupted after 300 days for 10 days. Plot onthe left contains the full time interval, whereas the plot on the right focuses on the time period

directly after the patient went off medication.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 11: Receding Horizon Control of HIV

RHC OF HIV 691

0 200 400 600 800 10000

50

100

150

200

250

300

350

400

time (days)

no treatment interruptionwith treatment interruption

250 300 350 400 450 500 550 600 650

0

20

40

60

80

100

time (days)

no treatment interruptionwith treatment interruption

Figure 13. Immune response when control is allowed to vary every 5 days. Comparison betweenfollowing treatment and when treatment was interrupted after 300 days for 10 days. Plot onthe left contains the full time interval, whereas the plot on the right focuses on the time period

directly after the patient went off medication.

stimulation of the immune response by approximately 120 days. However, it should also be notedthat the RHC-based treatment strategy was able to overcome the failure to take medication andstill transfer the patient to healthy steady state. Figure 13 repeats this situation when the control isallowed to vary every 5 days. Note that when the control can only vary every 5 days the stimulationof the immune response is delayed by approximately 180 days, a longer delay than when thecontrol can vary daily.

6. NUMERICAL RESULTS: STATE ESTIMATION

In this section, we will address the problem of implementing RHC methodology while relying onthe EKF to estimate the state each time a new control is implemented. RHC is a feedback controltechnique; thus it requires state knowledge. It does not require state knowledge continually as aRicatti equation technique does, but it does require an estimate of the state each time a new controlis computed. As we have a nonlinear model, we choose to employ the EKF. In particular, we willconsider a formulation based on a continuous process model and discrete time measurements. Thegeneral problem we want to consider is given a mathematical model for a physical system, x ,corrupted by some process noise, w, and a model for data measurements, z, also corrupted bysome noise, v, what is the best, in some sense, estimate for the true physical system.

Mathematically, the problem can be stated as

x = f (x, t)+g(t)w(t),

zk = h(x(tk),k)+vk,(8)

where x(t0) is normally distributed with mean x0 and covariance P0, w(t) and vk are white noiseprocesses uncorrelated with x(t0) and with each other and with means 0 and covariances Q and R,respectively. The formulation of this problem in this manner was presented in [26]. In the casewhere f and h are linear, the problem reduces to the famed Kalman filter (KF) [27]. Note that inthis type of estimation, we are trying to estimate statistics that describe a probability distributionon the true state of the process. In the KF, the optimal estimate is defined by a normal distributionand thus only the mean, x , and the covariance, P , are needed to completely define the conditionaldistribution of the state dependent on the data, p[x(t)|(zk)]. See [28] for an excellent introductionto the topic.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 12: Receding Horizon Control of HIV

692 J. DAVID, H. TRAN AND H. T. BANKS

The EKF attempts to make an estimate of the true state with a ‘predictor–corrector’ sort ofimplementation. First in the ‘predictor’ stage of the algorithm where no data is available and thedynamics are defined by the model, the time update is employed.

˙x = f (x, t),

P = P∇ f (x)T +∇ f (x)P +gQgT.(9)

These equations are integrated between time points tk and tk+1. Then when new data are available,the ‘corrector’ stage or measurement update is employed

Kk = P−(tk)∇h(x)T[∇h(x)P−(tk)∇h(x)T + R]−1,

P(tk) = [I −Kk∇h(x)]P−(tk),

xk = x−k +Kk[zk −∇h(x)x−

k ].

(10)

Note that this method only gives us the mean and covariance. Since f and h are nonlinear,the conditional distribution is generally non-normal and would require perhaps infinitely manystatistics to fully describe it. Further, discussion on the issues related to implementing this filteron this particular model can be seen in [17, 18].

We will use simulated data created by the model, corrupted by observation noise. The onlyprocess noise we will assume is based on the accuracy of the integrator and this is assumed to besmall. Thus, we will choose Q =10−6 I . The form of the data we are going to use is

zk =

⎛⎜⎝

T1 +T ∗1

VI +VNI

E

⎞⎟⎠+rk, (11)

0 100 200 300 400 500 600 700 800 900 10000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0 100 200 300 400 500 600 700 800 900 10000

0.05

0.1

0.15

0.2

0.25

0.3

0.35

time (days)

noisy datafilter

VI+VNI

T1+T1*

noisy datafilter

time (days)

E

noisy datafilter

00

100 200 300 400 500 600 700 800 900 1000

200

400

600

800

1000

0 100 200 300 400

0

100

–100

200

300

400

500 600 700 800 900 1000

0 100 200 300 400 500 600 700 800 900 1000

0

200

–200

400

600

800

Figure 14. Control found when EKF is used as state estimator on the left. Performanceof EKF as a filter of noisy data on the right.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 13: Receding Horizon Control of HIV

RHC OF HIV 693

where rk is normally distributed with mean 0 and covariance R. For the following experiments, Rwill be of the form

R =

⎛⎜⎝

r1 0 0

0 r2 0

0 0 r3

⎞⎟⎠ , (12)

where r1 =1000, r2 =1000 and r3 =10. The control was allowed to vary every 10 days. Data wascollected every 10 days. The initial state for the problem was the unhealthy steady state. Theinitial condition for the estimator was chosen as xEKF

0 =0.3�x true0 , where � is normally distributed

with mean 1 and standard deviation 1. Figures 14 and 15 show the results from this experiment.Note that the EKF is rather effective in filtering the noisy data. This allows for the estimate ofthe initial condition xi to the control problem, Oi , to be accurate. This ultimately allows theRHC methodology to be successful in transferring the system to the immune response dominantequilibrium.

0 500 10000

200

400

600

800

1000T1

0 500 10000

0.2

0.4

0.6

0.8T2

0 500 10000

50

100

150

200

T1*

0 500 10000

0.05

0.1

0.15

0.2

T2*

0 500 10000

200

400

600

800

1000

time (days)

EKF estimation vs. true state

VI

0 500 10000

100

200

300

400E

Figure 15. Performance of EKF as a state estimator. The dotted line denotes EKF estimation, whereas thesolid line represents true state.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 14: Receding Horizon Control of HIV

694 J. DAVID, H. TRAN AND H. T. BANKS

7. STATE ESTIMATION AND UNSCHEDULED TREATMENT INTERRUPTIONS

An additional test of this methodology that we would like to consider is the case of using theEKF as a state estimator in addition to interrupting the treatment for 10 days after 300 days notin accordance with the proposed treatment schedule. Data are collected every 5 days, whereasthe control is allowed to vary every 5 days. The parameters for the observation noise for thefollowing experiment are r1 =1000, r2 =1000, and r3 =10. The initial condition for the esti-mator is chosen as xEKF

0 =0.3�x true0 , where � is normally distributed with mean 1 and standard

deviation 1. The plots for these simulations can be seen in Figures 16 and 17. These results areencouraging. We were able to transfer from unhealthy steady state to healthy in the presence ofunscheduled treatment interruption and noisy data collected every five days within approximately800 days.

8. INACCURATE MODEL PARAMETERS

Another natural question to pose especially when considering feedback control is how robustthis methodology is with respect to inaccurate models. As this or any other model is only anapproximation of the true physical/biological process, we will study the effectiveness of a dosingstrategy based on an inaccurate model. The experimental set up will remain the same as inSection 4 except for the fact that our treatment schedule will be based on an inaccurate model.What we mean by inaccurate model is that we will assume that the true system is governed byEquation (1) with model parameters (values in Table II) that vary randomly in each time interval. Incomputing the control we will not use these randomly perturbed parameters, but rather the values in

0 100 200 300 400 500 600 700 800 900 10000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0 100 200 300 400 500 600 700 800 900 10000

0.05

0.1

0.15

0.2

0.25

0.3

0.35

time (days)

noisy datafilter

VI+VNI

T1+T1*

noisy datafilter

time (days)

E

noisy datafilter

00

100 200 300 400 500 600 700 800 900 1000

200

400

600

800

1000

0 100 200 300 400

0

100

–100

200

300

400

500 600 700 800 900 1000

0 100 200 300 400 500 600 700 800 900 1000

0

200

–200

400

600

800

Figure 16. Control found when EKF is used as state estimator on the left. Performance of EKF as a filterof noisy data on the right. Unscheduled 10 day treatment lapse after 300 days.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 15: Receding Horizon Control of HIV

RHC OF HIV 695

0 500 10000

200

400

600

800

1000

T1

T2

0 500 10000

0.2

0.4

0.6

0.8

0 500 10000

50

100

150

200

T1*

T2*

0 500 10000

0.1

0.2

0.3

0.4

0 500 10000

200

400

600

800

VI

EKF estimation vs. true state

time (days)

0 500 10000

100

200

300

400

E

true statefilter performance

Figure 17. Performance of EKF as a state estimator. Dotted line denotes EKF estimation,whereas solid line represents true state. Data were collected every 5 days. Unscheduled

10 days treatment lapse after 300 days.

Table II. The following list will further illustrate this experiment. Let q denote the vector of modelparameters.

1. Using the model, x = f (x,qtab,u), where qtab are the parameter values from the table, solveproblem Oi for optimal control ui .

2. Create the random set of parameters which will define the true state of the patient for the timeinterval [ti ti+1], using the formula (qpat) j = (�) j (qtab) j , where (·) j is the j th component ofa vector and each component of � is normally distributed with mean 1 and a given standarddeviation.

3. The model, x = f (x,qpat,u), is used to simulate the response of the patient on the given timeinterval.

4. The process is repeated on the following time interval.

For the model responses and controls in Figure 18, the standard deviation of (�) j is 0.05. Theresults when using inaccurate model parameters were far less promising than those for unscheduledtreatment interruptions of when using noisy data. To get the results in Figure 18, we had tofrequently change doses (daily) and the level of perturbation of the parameters had to be low.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 16: Receding Horizon Control of HIV

696 J. DAVID, H. TRAN AND H. T. BANKS

0 500 10000

500

1000T

0 500 10000

0.5

1T

0 500 10000

100

200

T

0 500 10000

0.2

0.4

T

0 500 10000

500

1000

1500V

0 500 10000

200

400

600

time (days)

E

0 200 400 600 800 10000

0.2

0.4

0.6

0.8

1

0 200 400 600 800 10000

0.1

0.2

0.3

0.4

time (days)

2

Figure 18. Model response and control when control is allowed to vary daily andmodel parameters are randomly perturbed.

Experiments in which more noise was added to the parameters or a less frequent change in dosingwas allowed were generally unsuccessful in the transferring the system to healthy steady state.

This behavior should not entirely be unexpected. The results from the sensitivity analysis in [17]show that this model, especially when the immune response is stimulated, is much more sensitive tovariation in parameters than variation in initial conditions. Thus, we should expect our methodologyto overcome imperfect state knowledge much more readily than inaccurate parameters.

9. RESULT’S SUMMARY

In the previous sections, we have described a variety of situations in which the EKF and RHCmethodology can be used to design treatment protocols that can overcome a variety of detrimentalclinical conditions. In this section, we will look at how robust this methodology is over a range ofclinical conditions, including how often the drug efficacy can change, how much noise is in thedata, how many unscheduled interruptions the patient undergoes, how accurate the initial conditionof the patient is known and how accurate the parameters in the model used to predict treatment areknown. The default case we consider is that the drug efficacy can change every 5 days, the noisein the data has an error of 1 standard deviation (based on the works of [29, 30] we have used thestandard deviation of the estimate of CD4+ count to be 22.4 cells/mm3, of the CD8 count to be17.6 cells/mm3 and viral load 20 RNA copies/mm3), no unscheduled treatment interruptions andan accurately known model to estimate treatment protocol. Under these conditions, we simulated500 patients and the algorithm successfully transferred the patient to the healthy equilibrium at arate of 87.6 %. We then varied each one of the conditions and simulated 100 patients. The resultsfor these simulations are summarized in Figure 19.

There are several key findings in this figure. First, this methodology is most sensitive to eitherlong periods without being able to change the drug efficacy; it falls to 26% when the drug canonly change efficacy ever 10 days, or to a poorly known model to predict the treatment strategy.When the model parameters are varied to 10% of their base value, the transfer percentage dropsto 45% and when the parameters are varied to 25% of their base value, the transfer percentagedrops to 33%. These results are not nearly as high as those reported in [13], with a success rateof 98% at 25% model error and a success rate of 90.7% at 30% error. We believe the degradationof success in similar control methodologies is caused by the increased ill-conditioning of thisproblem when using an experimentally validated model. Additionally, these results were not terriblysurprising because, as was reported in [17], this model is quite sensitive to both treatment efficacyand parameter values. Based on the speed of the viral dynamics of HIV, it is not surprising that

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 17: Receding Horizon Control of HIV

RHC OF HIV 697

Figure 19. Summary of control methodology performance over a variety of clinical conditions.

a model fit to clinical data would exhibit this type of behavior. Second, when one can changedrug efficacy often enough and the underlying dynamics are known well, this methodology isquite robust against noise, inaccurate initial knowledge of patient condition and interruptions intreatment. The performance has minimal degradation in percentage of successful transfers in all ofthese conditions. This methodology’s success against measurement noise is similar to that presentedin [13]. This is most likely due to the ability of the EKF to converge to the true patient conditionand filter noise in the data.

10. CONCLUSION AND FUTURE DIRECTIONS

This paper introduced RHC control methodology in the context of an HIV mathematical model.To the authors knowledge, this is the first work to combine RHC and the EKF in the context ofHIV control. It was shown that these techniques can be used to derive treatment strategies thatcan effectively control a model of HIV infection under a variety of adverse conditions. Theseadverse conditions include unscheduled treatment interruption, beginning treatment on a suppressedimmune system, incomplete and noisy data, and an inaccurate model of HIV dynamics. This workalso illustrated the relationship between how often the efficacy of the drugs can be changed and thelength of time necessary to reach the immune dominant steady state. It also explored the number ofinterruptions necessary under several detrimental conditions. The simulation of a large number ofrandomized clinical conditions illustrates both the ability of computational models to perform whatwould be difficult to do in a clinical setting and the robustness and sensitivity of the methodologyto a variety of conditions. This analysis suggests that precise drug efficacy control and accurateunderstanding of viral dynamics are key to implementing a control strategy such as this.

This work sets the stage for a variety of ways: mathematical analysis can be used to improvetreatment of HIV patients. We have dealt with several of the issues related to designing control-based treatment protocols. However, this work also begs the question of several courses of analysis

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 18: Receding Horizon Control of HIV

698 J. DAVID, H. TRAN AND H. T. BANKS

related to better designing these protocols. These improvements include the implementation of afilter that incorporates the censored data produced by many viral assays, a pharmacokinetic modelof specific drugs, incorporation of the potential for drug-resistance in model and intervals of datacollection and treatment changes that are more in line with a clinical condition.

ACKNOWLEDGEMENTS

This research was supported in part by Grant Number R01AI071915-07 from the National Institute ofAllergy and Infectious Diseases and in part by the Air Force Office of Scientific Research under grantnumber FA9550-09-1-0226.

REFERENCES

1. Kirschner D, Lenhart S, Serbin S. Optimal control of the chemotherapy of HIV. Journal of Mathematical Biology1997; 35:775–792.

2. Joshi HR. Optimal control of an HIV immunology model. Optimal Control Applications and Methods 2002;23:199–213.

3. Adams BM, Banks HT, Kwon H-D, Tran HT. Dynamic multidrug therapies for HIV: optimal and STI controlapproaches. Mathematical Biosciences and Engineering 2004; 1(2):223–241.

4. Culshaw RV, Ruan S, Spiteri RJ. Optimal HIV treatment by maximising immune response. Journal of MathematicalBiology 2004; 48:545–562.

5. Fister KR, Lenhart S, McNally JS. Optimizing chemotherapy in an HIV model. Electronic Journal of DifferentialEquations 1998; 1998(32):1–12.

6. Garira W, Musekwa SD, Shiri T. Optimal control of combined therapy in a single strain HIV-1 model. ElectronicJournal of Differential Equations 2005; 2005(52):1–22.

7. Perelson AS, Nelson PW. Mathematical analysis of HIV-1 dynamics in vivo. SIAM Review 1999; 41(1):3–44.8. Banks HT, Kwon H-D, Toivanen JA, Tran H. An SDRE base estimator approach for HIV feedback control.

Optimal Control Applications and Methods 2006; 27(2):93–121.9. Mayne DQ, Rawlings JB, Rao CV, Scokaert POM. Constrained model predictive control: stability and optimality.

Automatica 2000; 36:789–814.10. Qin S, Badgewell T. An overview of industrial model predictive control technology. Chemical Process Control

1997; 93(316):232–256.11. Shim H, Han SJ, Chung CC, Nam SW, Seo JH. Optimal scheduling of drug treatment for HIV infection:

continuous dose control and receding horizon control. International Journal of Control, Automation and Systems2003; 1(3):401–407.

12. Zurakowski R, Messina MJ, Tuna SE, Teel AR. HIV treatment scheduling via robust nonlinear model predictivecontrol. Proceedings of the Fifth Asian Control Conference, Melbourne, Australia, 2004.

13. Zurakowski R, Teel AR. A model predictive control based scheduling method for HIV therapy. Journal ofTheoretical Biology 2006; 238(2):368–382.

14. Zurakowski R. An output-feedback MPC-based scheduling method for enhancing immune response to HIV.Proceedings of the American Control Conference, Minneapolis, MN, 2006.

15. Wodarz D, Nowak M. Specific therapy regimes could lead to long-term immunological control of HIV. Proceedingsof the National Academy of Sciences 1999; 96:14464–14469.

16. Adams BM, Banks HT, Davidian M, Rosenberg ES. Model fitting and prediction with HIV treatment interruptiondata. Bulletin of Mathematical Biology 2007; 69:563–584.

17. David JA, Tran HT, Banks HT. HIV model analysis and estimation implementation under optimal control basedtreatment strategies. International Journal of Pure and Applied Mathematics 2009; 57(3):357–392.

18. David JA. Optimal control, estimation and shape design: analysis and applications. Ph.D. Thesis, North CarolinaState University, 2007.

19. Callaway DS, Perelson AS. HIV-1 infection and low steady state viral loads. Bulletin of Mathematical Biology2002; 64(1):29–64.

20. Culshaw RV. Review of HIV models: the role of the natural immune response and implications for treatment.Journal of Biological Systems 2004; 12(2):123–135.

21. Bonhoeffer S, Rembiszewski M, Ortiz GM, Nixon DF. Risks and benefits of structured antiretroviral drug therapyinterruptions in HIV-1 infection. Acquired Immune Deficiency Syndromes 2000; 14:2313–2322.

22. Adams BM. Non-parametric parameter estimation and clinical data fitting with a model of HIV infection. Ph.D.Thesis, North Carolina State University, 2005.

23. Adams BM, Banks HT, Davidian M, Kwon HD, Tran HT, Wynne SN, Rosenberg ES. HIV dynamics: modelling,data analysis and optimal treatment protocols. Journal of Computational and Applied Mathematics 2005;184:10–49.

24. Ito K, Kunisch K. Asymptotic properties of receding horizon optimal control problems. SIAM Journal on Controland Optimization 2001; 40(5):1585–1610.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca

Page 19: Receding Horizon Control of HIV

RHC OF HIV 699

25. Kelley CT. Iterative Methods for Optimization. SIAM: Philadelphia, PA, 1999.26. Lewis F. Optimal Estimation: With an Introduction to Stochastic Control. Wiley: New York, 1986.27. Kalman RE. A new approach to linear filtering and prediction problems. Transactions of the ASME—Journal of

Basic Engineering 1960; 82(Series D):35–45.28. Welch G, Bishop G. An introduction to the Kalman filter. Technical Report TR95-041, Department of Computer

Science, University of North Carolina, 2001.29. Murphy DG, Ct L, Fauvel M, Ren P, Vincelette J. Multicenter comparison of roche COBAS AMPLICOR

MONITOR version 1.5, organon teknika nuclisens qt with extractor, and bayer quantiplex version 3.0 forquantification of human immunodeficiency virus type 1 rna in plasma. Journal of Clinical Microbiology39(3):4034–4041, 200.

30. Hagihara K, Yamane T, Aoyama Y, Nakamae H, Hino M. Evaluation of an automated hematology analyzer(CELL-DYN 4000) for counting cd4+ t helper cells at low concentrations. Annals of Clinical and LaboratoryScience 2005; 35:31–36.

Copyright � 2010 John Wiley & Sons, Ltd. Optim. Control Appl. Meth. 2011; 32:681–699DOI: 10.1002/oca