The Impact of Delays on Service Times in the Intensive Care Unit

Carri W. Chan
Decision, Risk, and Operations, Columbia Business School, New York, NY 10027, [email protected]

Vivek F. Farias
Sloan School of Management, Massachusetts Institute of Technology, Cambridge, MA [email protected]

Gabriel Escobar
Division of Research, Kaiser Permanente, Oakland, CA [email protected]

This version: July 23, 2013

Mainstream queueing models are frequently employed in modeling healthcare delivery in a number of settings, and

further used in making operational decisions for the same. The vast majority of these queueing models assume that the

service requirements of a job are independent of the state of the queue upon its arrival. In a healthcare setting, this

assumption is equivalent to ignoring the effects of delay experienced by a patient awaiting care. However, it is only natural

to conjecture that long delays may have adverse effects on patient outcomes and can potentially lead to longer lengths

of stay (LOS) when the patient ultimately does receive care. At a very coarse level, prior research confirms these natural

conjectures. This work sets out to understand these delay issues from an operational perspective. In particular, using data

of nearly 6,000 Emergency Department (ED) visits, we use an instrumental variable approach to empirically measure

how congestion in the Intensive Care Unit (ICU) can lead to delays in boarding from the ED to the ICU and measure the

impact on the patient’s ICU LOS.

Capturing these empirically observed effects in a queueing model is challenging as the effect introduces potentially

long range correlations in service and inter-arrival times. As such, we consider the problem of how to incorporate these

measured delay effects into a queueing model and characterize approximations to various quantities of interest when the

service time of a job is adversely impacted by the delay experienced by that job. Our findings suggest that this delay

effect can be substantial and ignoring it when using queueing models to model healthcare delivery systems may result in

significant under-provisioning.

Key words: Delay effects, queueing, Healthcare

1. Introduction

Delays arise routinely in various healthcare settings: they are a consequence of the inherent, highly variable

requirements of healthcare services and the overwhelming demand for these services. It is natural to conjec-

ture that delays in receiving the appropriate care can result in a variety of adverse outcomes – and indeed,

there is some support for such conjectures. This paper proposes to study one such adverse outcome in the

intensive care setting: delays in receiving intensive care can result in longer lengths of stay in Intensive

Care Units (ICUs). From an operational perspective, this effect has two consequences. The first, of course,

is the immediate impact on the delayed patient. The second, systemic impact is the increased congestion

caused by the increased care requirements for the delayed patient. In particular, the increased ICU length

of stay can result in delays to other patients requiring the same ICU resources, which in turn results in

longer lengths of stay for those patients, and so forth. This paper will (empirically) study the extent of this

phenomenon. We then propose to modify extant queueing models (that are frequently used to model such

systems) to account for the phenomenon and present a theoretical analysis for the same.

Delays and the ED-ICU Interface: One place where delays are apparent is in the Emergency Depart-

ment (ED). Due to growing demands and reductions in the number of physicians, nurses and beds, EDs are

often overcrowded (Burt and Schappert 2004). The overall median wait to see an ED physician increased

from 22 minutes in 1997 to 30 minutes by 2004 (Wilper et al. 2008). A number of factors can contribute to

delays. For instance, delays may be due to an overload of patients in the ED and insufficient resources to

treat patients in a timely manner. Delays can also occur following assessment and stabilization of patients.

About 10% of patients initially admitted to the hospital through the ED are treated in an ICU; here patients

can experience further ‘boarding’ delays due to a multitude of factors ranging from congestion in the ICU

which forces patients to wait in the ED until an ICU bed becomes available (see Litvak et al. (2001)) to

coordination mishaps.

Turning attention to the ICU, we note that ICUs typically provide the highest level of care with one nurse

for every one to two patients. These units are very expensive to operate and typically account for 20% of hospital

operating costs despite comprising only 10% of the beds (Rivera et al. 2009). Consequently, these units are

often operated at or ‘above’ capacity. Hospitals have developed a number of approaches to deal with ICU

congestion. For instance, ICU congestion can result in discharging current patients preemptively (Chalfin

2005, Dobson et al. 2010, Kc and Terwiesch 2012, Chan et al. 2012), blocking new patients via ambulance

diversion (Allon et al. 2013) or rerouting patients to different units (Thompson et al. 2009, Kim et al. 2012).

In this work, we focus on a frequent symptom of this congestion: admission delays. With an increase in

critical care usage (Halpern and Pastores 2010) and a relatively stagnant supply of ICU beds, it is no wonder

that delays for patients awaiting ICU admission are growing.

This paper will focus on the flow of patients from the ED into the ICU. In particular, we will examine

the ‘boarding’ delay experienced by these patients and the impact of this delay on the length of the patient’s

stay in the ICU.

Standard Queueing Models Fall Short: Queueing models are often used to model and analyze patient

flows in hospital settings. These models are predictive and can provide valuable insight into the impact of

changing demand scenarios as well as staffing, or more generally capacity provisioning alternatives. See

Green (2006) for an overview of how queueing models have been used in healthcare applications. The

vast majority of these queueing models assume that the service requirement of a job is independent of the

state of the queue upon its arrival. In a healthcare setting, this assumption is equivalent to ignoring the

effects of delay experienced by a patient awaiting care. As we show at a granular level in this paper, this

is not a tenable assumption. In addition, there have been various condition specific studies in the medical

community demonstrating that delays can result in an increase in mortality (de Luca et al. 2004, Chan et al.

2008, Buist et al. 2002, Yankovic et al. 2010) and/or extend patient Length-of-stay (LOS) (Chalfin et al.

2007, Renaud et al. 2009, Rivers et al. 2001).

As we shall see, even in the simplest settings, a natural queueing model that captures the impact of

delays on service time can be modeled as a high dimensional Markov chain which does not appear easy

to analyze. This is not surprising, since capturing the delay effect creates long-run correlations between

service times and inter-arrival times and very little can be said about such systems. While such models may

still be beneficial in simulation, the queueing phenomena made transparent by simple G/G/n type models

are obscured. As such, an important component of this paper is a simple set of closed-form approximations

to key performance metrics for such systems.

Questions and Contributions: In this work, we focus on the effect of delay on patient length of stay in

the ICU and characterize the potential congestion caused by any increase in ICU length of stay due to this

effect. In particular we consider the following questions:

1. What is the relationship between an additional hour of waiting for critical care and additional LOS

for a patient when she does eventually receive this care? We answer this question using empirical data

on patient flows from a large hospital network. Our study focuses on how delays in boarding from the

Emergency Department (ED) to the ICU impact patient LOS in the ICU. Our empirical study is granular

and characterizes the magnitude of this effect for a variety of patient primary conditions. We find strong

evidence for the conjecture that increased ED boarding times are associated with longer ICU lengths of stay.

Loosely, for some primary conditions (such as catastrophic patients), a single additional hour of boarding

delay (relative to mean delay) is associated with approximately four additional hours in the ICU (relative to

the mean LOS for that class of patients).

2. Can we incorporate such delay effects in the queueing models we use for capacity planning? The

natural analogue of an M/M/s queueing model unfortunately calls for the analysis of a high dimensional

Markov chain which is analytically intractable and obscures queueing phenomena. We present a rigorous,

analytically tractable approximation to such models that, in addition to being quite accurate, provides a

simple, transparent view of the impact of congestion on performance metrics of interest in the presence of

the delay effect. This, in turn, allows for the same flexibility of an M/M/s model while accounting for

the delay effect. We view the simplicity of these approximations as surprising since queueing systems with

long-range correlations in service and inter-arrival times are known to be notoriously difficult to analyze.
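
The feedback loop described above (a delayed patient needs more care, which in turn delays later patients) can be illustrated with a small simulation. The sketch below is only a hypothetical illustration, not the model analyzed in this paper: it assumes a first-come-first-served multi-server queue with Poisson arrivals, exponential nominal service times, and an additive linear rule in which each hour of waiting adds `delay_effect` hours of service. The function name `mean_wait` and all parameter values are assumptions chosen for the example.

```python
import heapq
import random


def mean_wait(num_servers, arrival_rate, mean_service, delay_effect,
              num_jobs=20000, seed=0):
    """Average wait in an FCFS multi-server queue with a linear delay effect.

    Hypothetical rule (an assumption of this sketch):
        service time = nominal service time + delay_effect * waiting time
    """
    rng = random.Random(seed)
    free_at = [0.0] * num_servers  # min-heap of times at which servers free up
    clock = 0.0
    total_wait = 0.0
    for _ in range(num_jobs):
        clock += rng.expovariate(arrival_rate)       # next arrival epoch
        base = rng.expovariate(1.0 / mean_service)   # nominal service requirement
        start = max(clock, heapq.heappop(free_at))   # earliest available server
        wait = start - clock
        # The job occupies the server longer if it had to wait.
        heapq.heappush(free_at, start + base + delay_effect * wait)
        total_wait += wait
    return total_wait / num_jobs


# Illustrative values only: 10 beds, 0.15 arrivals/hour, 56-hour nominal stay.
baseline = mean_wait(10, 0.15, 56.0, delay_effect=0.0)
with_effect = mean_wait(10, 0.15, 56.0, delay_effect=1.0)
print(baseline, with_effect)
```

Because both runs use the same seed, they see the same arrivals and nominal service times, so any gap between the two averages is due to the delay effect alone.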

While physicians recognize that delays are detrimental for an individual patient, our analysis provides

insight into the impact such delays may have on increasing overall congestion and reducing access to care

for other critical patients. Perhaps the most important operational insight that arises from our work is the

size of the effect that interventions that decrease boarding delays can have on key system measures. In particular,

our analysis reveals that such interventions can prove just as important as capacity augmentation! Via our

empirical and theoretical analysis, we demonstrate that ignoring the delay effect when using queueing mod-

els to analyze healthcare operations can result in severe under-provisioning. Moreover, ignoring such delay

effects and the subsequent increase in congestion may result in hospitals utilizing other congestion control

measures, such as ambulance diversion, more frequently.

The rest of this paper proceeds as follows. We first review some related literature in Section 1.1. Section 2

provides empirical motivation for our delay-sensitive queueing model. Section 3 presents a simple queueing

model which incorporates state-dependent service times. We examine this model in a Markovian framework

in Section 3.2. In Section 4, we develop approximations for the system backlog and demonstrate that

the impact of delays can be substantial. In Section 5, we examine the performance of these approximations.

Section 6 concludes.

1.1. Related Literature

The medical community has invested significant effort into measuring the detrimental impact of delays on

patient outcomes. The majority of this work has focused on a binary notion of delay: was a patient delayed

or not? For instance, a transfer from the Emergency Department (ED) to the Intensive Care Unit (ICU) was

labeled as ‘delayed’ if it was greater than 6 hours (Chalfin et al. 2007); however, there was no distinguishing

between 6 and 20 hours of delay. They find that the median hospital length of stay (inclusive of ICU and

general medical ward stay) is 1 full day longer and the in-hospital mortality rate was 35% higher for patients

who were boarded more than 6 hours. The definition of delay varies across different medical conditions

and scenarios. Renaud et al. (2009) compare the outcomes of pneumonia patients who are transferred to

the ICU within 1 day (non-delayed) versus 3 days (delayed) of presenting symptoms. They find that the

median hospital LOS and 28-day mortality rate are nearly twice as high for delayed patients. The order of

magnitude for delay can be in minutes as in the case of cardiac patients (de Luca et al. 2004, Buist et al.

2002, Yankovic et al. 2010, Chan et al. 2008) or up to 5 days for burn-injured patients (Sheridan et al. 1999).

All of these works focus on a single patient condition in a single hospital and may lead one to conjecture

that the delay effect is isolated to a narrow section of the patient population that visits the ICU. We verify

instead that the delay effect is prevalent across multiple hospitals and ailments.

In this work, we focus on how operational factors contribute to delay. Specifically, we empirically examine

the impact of ICU occupancy levels on ED boarding, where boarding time is defined as the time a patient

spends waiting in the ED for an inpatient bed assignment after a bed has already been requested. Addition-

ally, we consider how this delay impacts ICU LOS (as opposed to hospital LOS as the prior medical works

have considered). We are interested in examining the adverse feedback where congestion induces delays

which further increase congestion.

Shi et al. (2012) also consider ED boarding, but focus on the impact of hospital discharge policies on

patient boarding. Similar to our work, they consider empirical analysis to motivate stochastic models. Using

simulation models, they approximate inpatient operations in a hospital in Singapore. In our work, we aim to

provide analytic approximations to the impact of ED boarding on system dynamics such as average number

of patient hours in the system.

Most related to our empirical analysis are the works of Kc and Terwiesch (2009, 2012) and Anderson et al.

(2011). The authors consider how high load impacts ICU LOS following surgery. These works find that

high occupancy levels can result in shorter patient length-of-stay (LOS) due to a need to accommodate

new, more critical patients. Moreover, such reductions in LOS can increase risks for readmission and death.

In contrast, our work considers the admission, instead of discharge, process which is altogether a fundamentally

different medical decision. In particular, we examine how the occupancy level in the unit to which

a patient should be admitted can increase LOS in the current and subsequent unit. Kim et al. (2012) also

consider the impact of the occupancy levels of downstream hospital units; however, the focus is on how

high occupancy levels can affect patient routing and subsequently, patient outcomes. In the present work,

we focus on the ICU and how congestion impacts delays rather than the routing to a potentially less desirable

recovery unit.

Motivated by our empirical findings, we consider how to incorporate the measured delay effect

into our queueing models via state-dependent dynamics. There have been a number of works which

have considered state-dependent queueing systems. Powell and Schultz (2004), Ata and Shnerson (2006),

George and Harrison (2001) all consider queueing systems where service times can be increased or

decreased depending on congestion. In general, they find that service rates should increase with congestion.

Ata and Shnerson (2006) analyze an M/M/1 queue where service times can be reduced during congestion.

They consider a control problem of how to vary arrival rates, service rates, and prices depending on system

congestion. They find that the arrival rate should be decreased while the service rate should be increased

as the number of customers in the system grows. In contrast, we study a system where the service rate is

not controlled but a function of the system’s history and tackle the long range correlations that these effects

result in.

Anand et al. (2010) examines the quality-speed tradeoff in an M/M/1 queue where service times can be

reduced at the expense of service quality while reducing delay costs. They find that the equilibrium behavior

of a queueing system with service rates which vary with congestion is starkly different than in traditional

queueing models. We also compare the impact of congestion-dependent service times to traditional queueing

models. Our setting differs in two main respects: 1) service times increase with congestion, and we cannot

choose whether to increase or decrease them, and 2) we focus on the steady-state distribution of the queueing

system rather than the equilibrium control decisions.

Whitt (2003) considers how congestion increases with demand in an M/M/n system. In particular,

the arrival rate increases with congestion, whereas our service rate decreases. The arrival rate increases

with the number of servers n, but is strictly decreasing in a congestion measure which depends on the

number of servers. Depending on the congestion measure, different heavy-traffic regimes appear, which

can be used to estimate delay probabilities. While we also approximate the steady-state dynamics of a

congestion-dependent queueing system, we use a different approximation approach and focus on the impact

of congestion on service times, not arrival rates.

A number of approaches utilize limiting regimes to establish approximations for steady-state distributions

of state-dependent systems. For instance, Armony and Maglaras (2004) consider a system where customers

can select their service type, resulting in state-dependent arrival rates. Using approximations achievable

via analysis in the Halfin-Whitt regime, they establish estimates of the steady-state distributions of waiting

times. Mandelbaum and Pats (1998), Mandelbaum et al. (1998) use fluid approximations to approximate

state-dependent queueing networks. We also generate approximations of the steady-state distributions; how-

ever, we use a different approach by providing exact analysis for an upper bounding queueing system.

Perhaps the closest to our work is that of Whitt (1990) and Boxma and Vlasiou (2007) which examine a

G/G/1 queue with service times and interarrival times which depend linearly on delays. Under very special

conditions–e.g. the workload must decay over time, or interarrival times must increase as service rates

decrease–stability conditions and approximations to the waiting times can be derived. While both of these

works consider workload that may increase with delay, the dynamics of our system are very different. In

particular, we do not allow for the changes in interarrival times required for the results in Whitt (1990) and

Boxma and Vlasiou (2007). Consequently, the workload in our system will never decay as it must in the

aforementioned works.

While there has been important work focusing on state-dependent queueing systems, they are unable to

fully capture the healthcare specific dynamics which are estimated from real hospital data and presented in

this paper. Our goal is to develop a framework which accounts for the type of delay effect which can appear

in a healthcare setting. In doing so, we hope to expand the way queueing models can be used in such a

setting. Queueing theory has been a useful tool to estimate performance measures, such as waiting times,

and to provide support in operational decision making, such as determining staffing levels. For instance,

Yankovic and Green (2011) consider a variable finite-source queueing model to determine the impact of

nurse staffing on overcrowding in the Emergency Department. In a related vein, de Vericourt and Jennings

(2011) consider an M/M/s//n queue to estimate the impact of nurse-to-patient ratio constraints on patient

delay. Green et al. (2006) modified the traditional M/M/s queueing model to develop time-varying staffing

levels for the Emergency Department. To the best of our knowledge, despite the ever-present delay effect in

healthcare applications, no other works have explicitly taken it into account.

2. Empirical Motivation: Model and Analysis

In this section, we empirically examine delays for patients being transferred from the ED to the ICU. Since

delays can be caused for a number of reasons, we intend to focus on congestion related delays; i.e. delays

in the ED due to the unavailability of a bed in the ICU. We find that delayed transfers from the ED to the

ICU due to high ICU occupancy levels are associated with significant increases in ICU LOS. These findings

have significant implications for capacity planning and resource allocation in the ICU.

We will posit and estimate a reduced form model that relates patient physiological factors and ED board-

ing time (i.e. the delay between when a bed in the ICU is requested for that patient and the time the patient

is actually physically transferred from the ED to the ICU). The model permits the impact of boarding time

to be different across different patient categories.

2.1. Data

We analyze a large patient data set collected from 19 facilities within a single hospital network for a total

of 212,064 patient visits over the course of 1 year. This data includes patient level characteristics such as

age, sex, primary condition for admission (e.g. congestive heart failure or pneumonia), and four separate

severity scores based on lab tests and comorbidities. It also includes operational data which tracks each

patient through each unit, marking time and dates of admission and discharge. Hospital units were classified

into six broad categories including Emergency Department (ED), General Medical Ward, Transitional Care

Unit (TCU), Intensive Care Unit (ICU), Operating Room (OR), and Post Anesthesia Recovery Unit. As

this was an inpatient dataset, the captured time in the ED is the time difference between the order to admit

to an inpatient unit and when the patient actually left the emergency department. Hence, this captures the

ED boarding time and is measured as the time from when the admit order was placed until the patient is

physically admitted to an inpatient unit. Note that this does not include the time for triage, stabilization, and

assessment, all of which will typically be activities that occur prior to the request for an ICU bed.

Severity scores in the data were determined at the time of hospital admission and capture the severity of

the patients at the time the request for an ICU bed was made. In order to use these scores for risk adjustment,

we excluded all patients who were admitted to the ICU more than 48 hours after hospital admission since it

is unlikely the scores will accurately measure the severity of patients after that. These scores are used for the

over 3 million patients in this hospital network and have predictive power similar to that of the APACHE and SAPS

scores, with a c statistic in the 0.88 range (Zimmerman et al. 2006, Moreno et al. 2005). See Escobar et al.

(2008) for further description of these severity scores.

To understand the impact of delay on different patient types, we classify patients based on over 16,000

ICD9 admission diagnosis codes into 10 broad groups of ailments based on the types of specialists who

treat them: Cancer, Catastrophic, Cardiac, Fluid&Hematologic, Infectious, Metabolic, Renal, Respiratory,

Skeletal, and Vascular (Escobar et al. 2008). While there are some patients who do not fall into one of these

categories, we focus on these main groupings which the majority of patients fall under. The dataset we

analyzed consisted of over 102,800 ED patients, 7,700 of whom were transferred to the ICU within

48 hours of hospital admission.

We consider patients whose admission was classified as ‘ED, medical’, i.e. their admission was via the

ED and their ailment was not considered surgical. 900 patients were removed from the sample because they

died. This is common practice in the medical community because various factors, such as Do-not-resuscitate

orders, can skew LOS estimates for patients who die (Norton et al. 2007, Rapoport et al. 1996). We note

that we verified the robustness of our empirical analysis by also including patients who died and find our

results are quite similar. When determining occupancy levels, all patients are included.

The final dataset consisted of 5,996 ED patients who survived to hospital discharge and were transferred

to the ICU within 48 hours of the admission decision in the ED. The average ICU LOS for these patient

classes was 56 hours with the maximum ICU LOS of nearly 37 days. The average ED boarding time was

3.5 hours. The average age of the patients was 64. Table 1 summarizes the statistics for the different patient

categories. The average occupancy of the ICU was 70%.

Condition Category    Number of    ED Boarding Time (hours)    ICU LOS (hours)    Age (years)
                      Patients     Mean ± Std.                 Mean ± Std.        Mean ± Std.

Cancer                   27        4.28 ± 4.99                 52.50 ± 36.96      64.89 ± 11.01
Catastrophic            685        2.77 ± 3.91                 87.15 ± 83.70      62.20 ± 18.37
Cardiac                2203        3.57 ± 4.21                 37.75 ± 36.59      66.07 ± 14.31
Fluid&Hematologic       164        4.30 ± 5.24                 45.78 ± 47.79      64.70 ± 16.10
Infectious             1012        3.85 ± 4.71                 74.85 ± 84.08      65.73 ± 16.86
Metabolic               650        2.87 ± 3.30                 51.70 ± 57.21      48.64 ± 19.92
Renal                   123        3.49 ± 4.62                 64.04 ± 63.75      60.67 ± 16.44
Respiratory             741        3.32 ± 4.05                 65.50 ± 75.96      66.30 ± 15.62
Skeletal                 98        4.83 ± 5.78                 52.70 ± 55.09      66.00 ± 18.70
Vascular                293        3.27 ± 3.92                 53.01 ± 42.01      69.72 ± 13.69

Table 1 Summary Statistics for 10 patient categories
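
As a quick consistency check, the per-category figures in Table 1 can be pooled and compared with the overall averages quoted in the text. The sketch below simply re-types the patient counts and the mean columns from Table 1; the names `TABLE_1` and `pooled_mean` are illustrative and not part of the original analysis.

```python
# (category, patients, mean ED boarding hours, mean ICU LOS hours), from Table 1
TABLE_1 = [
    ("Cancer", 27, 4.28, 52.50),
    ("Catastrophic", 685, 2.77, 87.15),
    ("Cardiac", 2203, 3.57, 37.75),
    ("Fluid&Hematologic", 164, 4.30, 45.78),
    ("Infectious", 1012, 3.85, 74.85),
    ("Metabolic", 650, 2.87, 51.70),
    ("Renal", 123, 3.49, 64.04),
    ("Respiratory", 741, 3.32, 65.50),
    ("Skeletal", 98, 4.83, 52.70),
    ("Vascular", 293, 3.27, 53.01),
]


def pooled_mean(rows, column):
    """Patient-weighted average of one per-category mean column."""
    total = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / total


total_patients = sum(row[1] for row in TABLE_1)
mean_boarding = pooled_mean(TABLE_1, 2)
mean_los = pooled_mean(TABLE_1, 3)
print(total_patients, round(mean_boarding, 2), round(mean_los, 2))
```

The category counts sum to the 5,996 patients of the final dataset, and the pooled means land close to the reported averages of 3.5 hours of boarding and 56 hours of ICU LOS.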

2.2. Hypotheses

We wish to understand how ICU occupancy levels can impact delays to ICU admission and, in turn, how

this delay impacts patient ICU LOS. We consider the following hypotheses which are primarily motivated

by evidence in the medical literature as well as the medical expertise of one of the coauthors:

1. When the ICU is busy, patient admissions may be delayed. This results in an increase in ED Boarding

time for patients who are to be admitted to the ICU.

\[
\frac{\partial\, \text{ED BOARD}}{\partial\, \text{ICU OCC}} > 0
\]

2. The ‘delay effect’: ICU Admission delays can hurt patients’ outcomes. ICU LOS is increasing in ED

Boarding time.

\[
\frac{\partial\, \text{ICU LOS}}{\partial\, \text{ED BOARD}} > 0
\]

While both hypotheses are natural to conjecture, the significant role these phenomena can play in capacity

management (as we will see in the subsequent sections) merits that we establish their veracity rigorously. In

addition, the empirical study in this section will also allow us to quantify the magnitude of the delay effect

for different classes of patients.

2.3. Estimation Model

We now describe our reduced-form model which forms the basis for our estimate of the impact of boarding

delay on ICU LOS. To test hypothesis 1, we regress ED boarding time for patient $i$, $\text{ED BOARD}_i$, against

a measure of ICU occupancy and patient specific physiological variables. In particular, we let $\text{ICU BUSY}_i$

be an indicator for the ICU being in a busy state, as will be described in detail later. Further, let $X_i$ be a

vector of various physiologic and operational factors which may affect ICU LOS as well as ED boarding

time, such as patient severity, age, primary condition, day of admission, and hospital where care is received.

One of $X_i$’s components is a constant. Our model is then:

\[
\text{ED BOARD}_i = \beta^T X_i + \gamma\, \text{ICU BUSY}_i + \varepsilon_i \tag{1}
\]

where $\varepsilon_i$ is assumed to be zero-mean noise uncorrelated with $X_i$ and $\text{ICU BUSY}_i$. The coefficient $\gamma$

measures the relationship between ICU occupancy levels and ED Boarding time: $\gamma > 0$ would support

hypothesis 1.

To test hypothesis 2, we consider the ICU LOS of patient $i$, $\text{ICU LOS}_i$, and the ED boarding time for

that patient, $\text{ED BOARD}_i$. Letting $X_i$ be the same vector of features as before, our model is then:

\[
\text{ICU LOS}_i = \beta^T X_i + \sum_j \delta_j\, \text{ED BOARD}_i\, \mathbf{1}_{\{\text{AILMENT}_i = j\}} + \nu_i \tag{2}
\]

where $j$ indexes the set of possible ailments. The zero-mean noise term $\nu_i$ is assumed to be uncorrelated

with $X_i$. The coefficient $\delta_j$ may be interpreted as measuring how each additional hour of ED Boarding

increases expected ICU LOS for ailment group $j$: $\delta_j > 0$ would support hypothesis 2.
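
The ailment-specific terms in model (2) amount to one regressor per ailment group: the boarding time multiplied by an indicator for that group. A minimal sketch of how such columns could be assembled is given below; the helper name `interaction_columns` and the toy inputs are assumptions made for illustration and do not come from the study's data.

```python
import numpy as np


def interaction_columns(ed_board, ailment, categories):
    """Return an (n, J) matrix whose column j is ED_BOARD_i * 1{AILMENT_i = j}."""
    ed_board = np.asarray(ed_board, dtype=float)
    ailment = np.asarray(ailment)
    # One boolean indicator column per ailment category.
    indicators = np.column_stack([ailment == c for c in categories])
    # Scale each patient's indicator row by that patient's boarding time.
    return indicators * ed_board[:, None]


# Toy example: three patients, two ailment groups.
categories = ["Cardiac", "Renal"]
Z = interaction_columns([2.0, 5.0, 1.5], ["Cardiac", "Renal", "Cardiac"], categories)
print(Z)
```

Each row has a single non-zero entry, so the coefficient on column j is identified only by patients in ailment group j, which is what allows the delay effect to differ across groups.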

Instruments: We chose to not assume that $\nu_i$ and $\text{ED BOARD}_i$ are uncorrelated; correlation between

these two variables can arise for several plausible reasons, one of which is the impact unobserved patient

severity can have on both $\text{ICU LOS}_i$ and $\text{ED BOARD}_i$. An exceptionally severe patient may naturally

require a longer length of stay in the ICU (due to the increased time required for recovery). The same

patient may also be prioritized in any scheduling which could lead to shorter boarding times for that patient.

In particular, in such an event we would expect $\text{ED BOARD}_i$ and $\nu_i$ to be negatively correlated. Since

such exceptional factors are unobserved in the model, the negative correlation, if ignored, would result in

underestimating $\delta$. To address this issue we require suitable instrumental variables.

The occupancy level in the ICU is unlikely to be correlated with patient severity but is likely correlated

with the boarding time experienced by the patient and hence constitutes an excellent candidate for

an instrumental variable. In particular, we use $\text{ICU BUSY}_i$ as our instrumental variable. Our instrumental

variable regression permits an attractive interpretation as a two-stage regression: we replace $\text{ED BOARD}_i$

in model (2) with $\widehat{\text{ED BOARD}}_i$, the predicted ED Boarding time based on model (1).
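
The two-stage interpretation can be sketched on synthetic data. The example below is a simplified illustration, not the estimation code used for this study: it uses a single delay coefficient rather than one per ailment, a constant as the only control, and made-up parameter values (including a true effect of 3.0). Unobserved severity is constructed to shorten boarding and lengthen LOS, which reproduces the downward bias discussed above.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X already contains a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def two_stage_least_squares(y, endog, instrument, controls):
    """Stage 1: endog ~ controls + instrument. Stage 2: y ~ controls + fitted endog."""
    stage1_X = np.column_stack([controls, instrument])
    fitted = stage1_X @ ols(stage1_X, endog)
    stage2_X = np.column_stack([controls, fitted])
    return ols(stage2_X, y)[-1]  # coefficient on the predicted regressor


rng = np.random.default_rng(0)
n = 50_000
const = np.ones(n)
busy = rng.binomial(1, 0.3, n).astype(float)  # instrument: ICU busy indicator
severity = rng.normal(size=n)                 # unobserved by the analyst
# Assumed data-generating process: sicker patients board faster but stay longer.
board = 3.0 + 1.3 * busy - 1.0 * severity + rng.normal(size=n)
los = 50.0 + 3.0 * board + 10.0 * severity + rng.normal(size=n)

naive = ols(np.column_stack([const, board]), los)[-1]
iv = two_stage_least_squares(los, board, busy, const)
print(naive, iv)
```

In this synthetic setting the naive regression understates the delay effect, while the two-stage estimate recovers a value near the assumed 3.0. Note that standard errors from a hand-rolled second stage are not valid as-is; a dedicated IV routine would be needed for inference.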

2.4. Empirical Results

We first consider the impact of a busy ICU on ED Boarding. We define an ICU as ‘busy’ if the occupancy

level is greater than 80% of the maximum patient census over the course of the year. Because beds can be

flexed by bringing in additional staff, this is likely a lower bound on the actual occupancy level. Moreover,

it is possible that the delay effects will be seen prior to 100% occupancy as some beds may be reserved in

anticipation of patient arrivals from other hospital units, such as the Operating Room. This characterization

of the ICU being busy is similar to the approaches taken in Kc and Terwiesch (2012), Kim et al. (2012),

Chan et al. (2012) and Batt and Terwiesch (2012) among others. Note that we examined other measures of

busy, including different thresholds and times at which the occupancy was measured. The results are similar,

so we have included the most statistically significant ones.
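
The busy definition above can be written as a one-line rule. The sketch below assumes a list of census readings for a single ICU over the year; the function name `busy_flags` and the sample numbers are hypothetical.

```python
def busy_flags(census, threshold=0.8):
    """Flag each census reading that exceeds `threshold` times the peak census."""
    peak = max(census)  # maximum patient census over the observation window
    return [count > threshold * peak for count in census]


# Hypothetical readings: the peak census is 20, so the cutoff is 16 patients.
flags = busy_flags([12, 18, 20, 15, 17])
print(flags)
```

Using the observed peak rather than the licensed bed count as the denominator matches the definition in the text, since the number of staffed beds can be flexed.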

With p < .001, ED Boarding time increases by 1.3463 hours when the ICU occupancy level is greater

than 80%. This result supports hypothesis 1 and further supports using ICU occupancy as an instrumental

variable in Model (2).

We now consider the impact of ED Boarding on ICU LOS. As a measure of model robustness, we

consider two models: the first does not use any instrumental variables and the second uses ICU occupancy

as an instrument for ED Boarding as discussed earlier. Table 2 summarizes the delay effects for the 10

primary condition categories of interest. We see evidence of an endogeneity bias, especially in the case of

Renal patients, where it seems that increased boarding time actually reduces ICU LOS. This goes against

medical knowledge and intuition. We can see that the instrument is able to adjust for this bias. When we use

the IV, nearly all of the coefficients which capture delay effects increase (Cancer, which is not significant in either model, is the exception); all statistically significant coefficients in

this case are positive.

Coefficient             (i) Without IV          (ii) With IV

δ_Cancer                 1.8272   (2.3566)       1.6828   (5.7258)
δ_Catastrophic          -0.8235   (0.5897)       3.7687∗∗ (1.8169)
δ_Cardiac                0.0242   (0.3232)       2.5668∗  (1.4904)
δ_Fluid&Hematologic      0.3411   (0.8982)       6.3874∗∗ (2.5725)
δ_Infectious             0.5382   (0.4094)       3.1042∗  (1.6461)
δ_Metabolic             -0.4322   (0.7218)       2.6632   (1.8336)
δ_Renal                 -2.3084∗  (1.1789)      -1.7586   (2.7339)
δ_Respiratory            0.2935   (0.5508)       4.2770∗∗ (1.7418)
δ_Skeletal              -0.8038   (1.0555)      -0.6972   (3.1126)
δ_Vascular              -0.6246   (0.9028)       3.8718∗  (2.2522)

Standard errors in parentheses. ∗ p < 0.10; ∗∗ p < 0.05; ∗∗∗ p < 0.01

Table 2 ICU LOS regression results: (i) without instrumental variables; (ii) uses ICU Occupancy > 80% at ICU admission time as an instrumental variable.

We can see that for patient categories: Catastrophic, Cardiac, Fluid & Hematologic, Infectious, Respira-

tory, and Vascular the delay effect is statistically significant (p < .10). For these ailments, 1 additional hour

in ED delay is associated with an increase in ICU LOS by 2.5-6.5 hours. As we will see in our analysis of

queueing systems with delay-dependent service times, this impact can be substantial.

We do not see any statistically significant results for patient conditions Cancer, Metabolic, Renal, and

Skeletal. Cancer, Renal and Skeletal are the patient conditions with the fewest number of patients, so the

lack of statistically significant results may be attributed to the small sample size. There are 650 samples

of Metabolic patients, yet it seems that delays may have little impact on ICU LOS. This may be because

Metabolic, along with the Cancer and Renal categories, corresponds to chronic conditions including diabetes,

immune disorders, end stage renal disease, etc. Consequently, these patients may be more delay


tolerant. While the patients are considered severe (they still need ICU care), there is likely to be less urgency

when the patient’s primary condition for admission is chronic. Finally, Skeletal refers to conditions such as

broken hips, which may be susceptible to infection if left untreated; however, their urgency is likely to be

lower than other patients such as those with infections in the blood (Infectious). Hence, these four categories

which do not display statistically significant results for the delay effect seem to correspond to the conditions

which are most likely to have little to no relationship between ICU LOS and delayed admission.

We note that prior work has demonstrated that when the ICU is busy, patient LOS may decrease

(Kc and Terwiesch 2012). Their work focuses on a single cardiac ICU where patients are cared for following
cardiac surgery. In our case, we do not consider surgical patients; we focus on ED medical patients.
Kim et al. (2012) show that scheduled surgical patients are most likely to experience speedup when the
ICU is busy, while ED medical patients do not seem to experience speedup when the ICU becomes congested.
Our data is consistent with these findings. Moreover, our findings are robust to controls for the

possibility of speedup.

From our empirical analysis, it is clear that, for a large group of patients, delays in ICU admission are

associated with substantial increases in ICU LOS. As expected from the medical literature, the impact of

delays varies across different patient conditions. We next devote our attention to understanding the implications

of this delay effect on traditional queueing insights.

3. Incorporating the Delay Effect: M/M(f)/s Model

Motivated by our empirical analysis, we turn our attention to developing queueing models which incorporate

the delay effect. Such analysis allows one to measure the impact of ignoring the delay effect when using

conventional queueing approaches. To do this, we introduce an M/M/s-like queueing system which has

jobs with delay-dependent service times. Our analysis assumes a single patient class in order to focus on

the impact of the delay effect. Such an assumption is reasonable in hospitals with specialized ICUs. For

instance, some large hospitals have dedicated cardiac ICUs where non-surgical cardiac patients are given

priority.

We begin with a Markovian queueing system and modify it to account for the delay phenomenon. In

particular, we consider a model wherein the service time of a job is inflated from some nominal value by a
quantity which depends on the number of jobs in the queue upon the job's arrival. Hence, the service rate
of the standard exponential random variable depends on the delay of the job; we denote this dependence
by M(f), where f is an 'inflation' function that we will define shortly. Such a model is able to capture the

dynamics estimated from the patient data in the previous section.

We now formally introduce our delay-dependent queueing system. Consider an s-server queueing system
described as follows: jobs arrive according to a Poisson process at rate λ and are served in FIFO fashion.
We let $N_t$ denote the number of jobs in the system at time $t$. Job $i$ arrives at time $t_i$ and its service time
is exponentially distributed with mean $1 + f(N_{t_i^-})$, where $N_{t_i^-}$ is the number of jobs in the system
just before the arrival and $f(\cdot)$ is a growth function which satisfies the following requirements:

1. f(m) = 0 for m= 0.

2. f(·) is bounded and non-decreasing.

In what follows, we will examine the behavior of this system and the impact of the growth function,
f(m). We will refer to such a system as a queueing system with delay-dependent workload, and abbreviate
it with the notation M/M(f)/s.
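A minimal simulation sketch of the system just described may make the definition concrete. It is an illustration under stated assumptions, not the authors' simulator: the function name `simulate_mmfs`, its parameters, and the per-job accounting (a job with service requirement S that waits D time units contributes S·D + S²/2 to the time integral of the workload) are choices made here. The estimate ignores the small end-of-horizon effect from jobs still in the system when the run stops.

```python
import heapq
import random


def simulate_mmfs(lam, s, f, n_jobs=200_000, seed=0):
    """Estimate the long-run average workload of an M/M(f)/s FIFO queue.

    lam: Poisson arrival rate; s: number of servers;
    f: growth function mapping the number of jobs seen on arrival to the
       inflation of the mean service time (nominal mean is 1).
    """
    rng = random.Random(seed)
    free_at = [0.0] * s   # min-heap: time at which each server next becomes free
    departures = []       # min-heap: departure times of jobs still in the system
    t = 0.0
    area = 0.0            # time integral of the workload process
    for _ in range(n_jobs):
        t += rng.expovariate(lam)
        # Drop jobs that left before this arrival to count the jobs it sees.
        while departures and departures[0] <= t:
            heapq.heappop(departures)
        n_seen = len(departures)
        service = rng.expovariate(1.0 / (1.0 + f(n_seen)))  # mean 1 + f(n_seen)
        start = max(t, free_at[0])       # FIFO: take the earliest free server
        depart = start + service
        heapq.heapreplace(free_at, depart)
        heapq.heappush(departures, depart)
        # Work stays at `service` while waiting, then drains at rate 1.
        area += service * (start - t) + 0.5 * service ** 2
    return area / t


if __name__ == "__main__":
    step = lambda m: 0.2 if m >= 1 else 0.0   # hypothetical step inflation, N* = s = 1
    print(simulate_mmfs(0.5, 1, lambda m: 0.0), simulate_mmfs(0.5, 1, step))
```

With f identically zero the routine reduces to an ordinary M/M/s simulation, which gives a simple sanity check against the classical formulas.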

3.1. Stability of an M/M(f)/s system

We begin our analysis of the queueing system with delay-dependent service times by considering its
stability. While the stability condition, and consequently the throughput of an M/M(f)/s
system, is a relatively coarse performance benchmark, it provides interesting insight into the behavior of
such systems. We have that:

Proposition 1 An M/M(f)/s system is stable if and only if

$$\frac{\lambda}{s} \le \frac{1}{1 + f_{\max}},$$

where $f_{\max}$ is the maximum value taken on by $f(\cdot)$.

The proof of this result can be found in the appendix. To provide some intuition for this result, if a burst of
jobs arrives, they will all experience some delay and an increase in service requirement. If a particularly bad
burst of jobs arrives in sequence, the system will quickly deteriorate to the point where all jobs are delayed
and require maximal service time. Hence, the stability requirement is based on the maximum possible job
requirement. While the question of stability reduces to the standard stability characterization under the
worst-case scenario of all jobs inflating maximally, the system dynamics are more nuanced.
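As a quick numerical reading of the stability condition, consider hypothetical values chosen only for round numbers (they do not come from the data): s = 10 servers, nominal mean service time 1, and a maximal inflation of 0.25. The condition then caps the sustainable arrival rate at

$$\lambda \le \frac{s}{1 + f_{\max}} = \frac{10}{1.25} = 8,$$

whereas a delay-independent M/M/s model with unit service rate would treat arrival rates up to 10 as sustainable.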

3.2. A Markovian Model

Our delay-dependent queueing system can be represented as a multi-dimensional Markov chain. For the
sake of concreteness and simplicity of exposition we will consider a very simple f(·), and simply indicate
corresponding results for general f(·). In particular, we assume that the workload increase function, f(·), is
defined as follows:

$$f(m) = \begin{cases} 0, & m < N^*; \\ k, & m \ge N^*, \end{cases}$$

for some threshold occupancy level $N^* > 0$. Hence, the mean service time of each job is $1$ if there are fewer
than $N^*$ jobs in the system upon arrival and $1 + k$ otherwise. If $N^* = s$, this means any job which is delayed
will have an increased service requirement. Relating back to our empirical findings, the increase in service
requirement seems to occur if a new job sees an occupancy level of 80%, corresponding to $N^* = 0.8 \times s$.

Let $X = (X_N, X_D)$ be the system state, where $X_N$ is the number of jobs in the system who arrived with
fewer than $N^*$ jobs in the system and $X_D$ is the number who arrived with at least $N^*$ jobs in the system.
Note that due to the FIFO and non-preemptive service discipline,
if $X_N > 0$, then necessarily there are $(X_N \wedge s)$ jobs currently in service at rate $1$. The remaining servers,
$(s - X_N)^+$, will be serving jobs at rate $\frac{1}{1+k}$ if any are available. Otherwise, they will idle. We can verify
that the Markov property holds for our state as defined.

Proposition 2 An M/M(f)/s system can be represented as a Markovian system with state $X = (X_N, X_D)$.

PROOF: We show that the Markov property holds for our system. We let $X(i) = (X_N(i), X_D(i))$ be the
state at the $i$th state transition. What is left to show is that

$$P\big(X(i+1) = (x_N, x_D) \mid X(0), X(1), \dots, X(i-1), X(i)\big) = P\big(X(i+1) = (x_N, x_D) \mid X(i)\big).$$

We demonstrate this by considering the precise transition probabilities. Writing
$\Lambda = \lambda + (x'_N \wedge s) + \frac{x'_D \wedge (s - x'_N)^+}{1+k}$ for the total event rate in state $(x'_N, x'_D)$, we have

$$
\begin{aligned}
&P\big(X(i+1) = (x_N, x_D) \mid X(0), X(1), \dots, X(i-1), X(i) = (x'_N, x'_D)\big) \\
&\quad= \begin{cases}
\dfrac{\lambda}{\Lambda}, & \text{if } (x_N, x_D) = (x'_N + 1, x'_D) \text{ and } x'_N + x'_D < N^*; \\[2mm]
\dfrac{\lambda}{\Lambda}, & \text{if } (x_N, x_D) = (x'_N, x'_D + 1) \text{ and } x'_N + x'_D \ge N^*; \\[2mm]
\dfrac{x'_N \wedge s}{\Lambda}, & \text{if } (x_N, x_D) = (x'_N - 1, x'_D); \\[2mm]
\dfrac{\frac{x'_D \wedge (s - x'_N)^+}{1+k}}{\Lambda}, & \text{if } (x_N, x_D) = (x'_N, x'_D - 1); \\[2mm]
0, & \text{otherwise}
\end{cases} \qquad (3) \\
&\quad= P\big(X(i+1) = (x_N, x_D) \mid X(i) = (x'_N, x'_D)\big).
\end{aligned}
$$

It is clear that the transition probabilities depend only on the current state and are independent of the past. □
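The transition probabilities in (3) translate directly into a small function. The sketch below is illustrative only; the name `transition_probs` and its argument names are assumptions introduced here, and the function simply evaluates the cases of (3) for a given state.

```python
def transition_probs(x_n, x_d, lam, s, k, n_star):
    """Next-state probabilities from state (x_n, x_d), following the cases in (3).

    x_n: jobs that arrived seeing fewer than n_star jobs (nominal, rate 1)
    x_d: jobs that arrived seeing at least n_star jobs (inflated, rate 1/(1+k))
    """
    rate_nominal = min(x_n, s)                            # nominal jobs in service
    rate_inflated = min(x_d, max(s - x_n, 0)) / (1.0 + k)  # inflated jobs in service
    total = lam + rate_nominal + rate_inflated
    probs = {}
    # An arrival joins X_N or X_D depending on the occupancy it sees.
    if x_n + x_d < n_star:
        probs[(x_n + 1, x_d)] = lam / total
    else:
        probs[(x_n, x_d + 1)] = lam / total
    if rate_nominal > 0:
        probs[(x_n - 1, x_d)] = rate_nominal / total
    if rate_inflated > 0:
        probs[(x_n, x_d - 1)] = rate_inflated / total
    return probs


if __name__ == "__main__":
    print(transition_probs(1, 1, lam=1.0, s=2, k=1.0, n_star=2))
```

Because the function takes only the current state as input, it also makes the Markov property visible: nothing about the history is needed to compute the next-step distribution.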

The transition matrix for this Markov chain has a block diagonal structure. However, despite this structure,
solving for the steady-state dynamics involves solving a high dimensional matrix inversion. While one
may be able to solve this numerically, it does not provide much insight for the general model. Moreover,
this approach quickly becomes intractable with more general f functions. The state-space must grow in the
number of break-points in the function f, so that the block sizes in the transition matrix grow exponentially
in the number of break-points.

Despite starting from the innocuous M/M/s queueing model, the introduction of the delay effect makes
the resulting system far too difficult to permit an exact analysis. As such we focus on producing approximations
to quantities of interest (such as the expected workload) by constructing suitable upper bounding
systems. This analysis provides some insight into how the issues above might impact nominal predictions
that do not account for the impact of delay on service time.


4. Approximating The Workload Process

This section will be concerned with establishing and interpreting a simple (and fairly accurate) approximation
to the long run average workload of an M/M(f)/s system. In particular, let us denote by $W_t$ and
$N_t$, respectively, the workload and number-in-system processes in this system. Consider also an M/M/s
system with arrival rate $\lambda$ and service rate $\frac{1}{1+f_{\max}}$, where $f_{\max} = \max_m f(m)$. Assume the service discipline
for this system is FIFO. We denote by $\overline{W}_t$ and $\overline{N}_t$, respectively, the workload and number-in-system
processes in this system. We will frequently refer to the former system (the system we are interested in
analyzing) as system 1 and the latter system (which will have value in our producing bounds) as system
2. Finally, we denote by $\underline{W}_t$ the workload process in an M/M/s system with arrival rate $\lambda$ and service
rate $1$, i.e. a system without any delay effect or relationship to the growth function $f(m)$. We will refer to
this system as the baseline, delay-independent system and use its behavior as a comparison benchmark for
our M/M(f)/s system and the corresponding bound we will establish. We let $E[W]$, $E[\overline{W}]$, and $E[\underline{W}]$
denote the expected work in each system. That is, if we start the systems according to their respective stationary
distributions, then these correspond to the expected work in each system at time $0$: $E[W] = E[W_0]$,
$E[\overline{W}] = E[\overline{W}_0]$, and $E[\underline{W}] = E[\underline{W}_0]$.

4.1. An Upperbound for A Step Function

In order to provide more insight into the bound we will derive, we start by examining a special case of the
delay-growth function, f. In particular, we focus on the case where jobs have nominal service requirement
of mean $1$, which increases to $1 + k$ if there are $N^*$ or more jobs in the system upon arrival:

$$f(m) = \begin{cases} 0, & m < N^*; \\ k, & m \ge N^*. \end{cases}$$

Such a delay-growth function captures the increased service time required by jobs (patients) who arrive to
a congested system (i.e., $m \ge N^*$). As described in Section 3.2, we can relate this delay-growth function
directly to our empirical study by appropriately defining $N^*$. This bears similarities to some of the medical
literature which examines the increase in workload of delayed versus not delayed patients (Chalfin et al.
2007, Renaud et al. 2009). Moreover, we consider the case where the service times are exponentially distributed.
We can establish the following upper bound:

Theorem 1 Assume that $f(\cdot)$ is defined according to $f(m) = k$ for $m \ge N^*$, and $f(m) = 0$ otherwise. We
have that the expected workload, $E[W]$, satisfies

$$E[W] \le E[\overline{W}] - \lambda(2k + k^2)\,P(\overline{N} < N^*),$$

where $\overline{W}$ and $\overline{N}$ denote the workload and number of jobs in a traditional M/M/s system with arrival rate
$\lambda$ and service rate $1/(1+k)$.


The upper bound consists of the amount of work in the system if all jobs were inflated, which is then
corrected according to the second term in the bound. To provide some intuition for the correction term, let us
consider the case where $N^* = s$ and examine the amount of work contributed by an arbitrary job, $i$. We
note that we correct for the extra amount of work that is introduced whenever a job does not have to wait
upon arrival, i.e. $N_{t_i^-} < s$. A job that immediately begins service contributes a total of $\frac{1}{2}\overline{W}_i^2$ work, i.e. it
brings work $\overline{W}_i$ that is depleted at constant rate 1 until it completes service. The total contribution is then
the area of the right triangle with width and height equal to $\overline{W}_i$. Because this job does not have to wait, the
amount of work that is actually contributed is $\frac{\overline{W}_i^2}{2(1+k)^2}$, which accounts for the artificial inflation of the work
to expected size $1 + k$. Therefore, to account for the actual amount of work introduced by a job who does
not have to wait, we subtract the amount of work contributed by the inflated job, $\frac{1}{2}\overline{W}_i^2$, and add the amount
of work contributed by the correct mean-1 sized job, $\frac{\overline{W}_i^2}{2(1+k)^2}$. See Figure 1 for an illustration of the accounting to correct for
the excess work introduced. Recognizing that the second moment of an exponential random variable with
mean $\mu$ is $2\mu^2$, we derive the desired result.

[Figure 1: a job arriving at time $t_i$ with inflated work $\overline{W}_i$ (finishing at $t_i + \overline{W}_i$) compared with its nominal work $\frac{\overline{W}_i}{1+k}$ (finishing at $t_i + \frac{\overline{W}_i}{1+k}$); the shaded excess area is $\frac{1}{2}\overline{W}_i^2 - \frac{1}{2}\left(\frac{\overline{W}_i}{1+k}\right)^2$.]

Figure 1 Due to the inflation of all jobs, each job which experiences zero delay contributes excess work
which is shaded in gray.
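The triangle accounting above can be written out in one line. The sketch below assumes, as in the text, that the inflated work $\overline{W}_i$ of a job is exponential with mean $1 + k$ (the overline notation for the inflated job is a reconstruction used here), and uses the fact that an exponential random variable with mean $\mu$ has second moment $2\mu^2$:

$$E\left[\tfrac{1}{2}\overline{W}_i^2\right] - E\left[\tfrac{1}{2}\left(\tfrac{\overline{W}_i}{1+k}\right)^2\right] = \tfrac{1}{2}\cdot 2(1+k)^2 - \tfrac{1}{2}\cdot 2 = (1+k)^2 - 1 = 2k + k^2.$$

Multiplying this expected excess per non-delayed job by the arrival rate $\lambda$ and by the probability that an arriving job sees fewer than $N^*$ jobs gives the correction term $\lambda(2k + k^2)P(\overline{N} < N^*)$ of Theorem 1.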

Note that for k = 0, we recover the results for a queueing system without delay-dependent service times.
In the case of Markovian dynamics, we recover the classical results of an M/M/s queue. The first expression
in the upper bound corresponds to a system where all jobs have their service time increased, irrespective
of the amount of delay experienced. However, the workload does not unilaterally increase with the load.
The second part of the expression represents the correction for over-inflating the workload for jobs which
do not experience excess congestion. We note that this is an upper bounding system because, while we
account for the correct workload if a job is not delayed, we do not correct for the propagation effect of its
inflated workload on delays for future jobs. Still, the upper bound is quite accurate for systems with various
growth factors, k, and numbers of servers, s. Figure 2 demonstrates the accuracy of the derived upper bound
in comparison to the simulated workload of the M/M(f)/s system.


[Figure 2: three panels, (a) 1 server, (b) 2 servers, (c) 10 servers, each plotting E[W] against λ for the simulated M/M(f)/s system and the upper bound (UB M/M(f)/s), with curves for k = .05, .1, and .2.]

Figure 2 Comparison of expected workload in a simulated M/M(f)/s system versus the derived upper bound
for s = 1, 2, and 10. Inflation is given by a step function: $f(m) = k\,\mathbf{1}\{m \ge s\}$ with k = .05, .1, and .2.

We observe that the upperbound in Theorem 1 admits a simple analytical expression. This allows us to

generate a clean understanding of the impact of delay on the workload process akin to our understanding

of the role factors such as utilization play in a traditional M/M/s system. We do this by deriving explicit

expressions for the upperbound.

4.1.1. Exact Expressions and Interpretation To allow for additional interpretation of our
bound, we leverage established expressions for M/M/s queues to evaluate it. We have for an
M/M/s queueing system with arrival rate λ and service rate µ, i.e. ρ = λ/(µs):

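$$\pi_0 = \left[\sum_{i=0}^{s-1}\frac{(s\rho)^i}{i!} + \frac{(s\rho)^s}{s!\,(1-\rho)}\right]^{-1}, \qquad
\pi_n = \begin{cases} \pi_0\dfrac{(s\rho)^n}{n!}, & n < s; \\[2mm] \pi_0\dfrac{\rho^n s^s}{s!}, & n \ge s. \end{cases} \qquad (4)$$

The expected work in the system, $E[W] = E[N]/\mu$, is given as:

$$E[W] = \frac{s\rho}{\mu} + \frac{1}{\mu}\cdot\frac{\dfrac{\rho}{(1-\rho)^2}\dfrac{(s\rho)^s}{s!}}{\displaystyle\sum_{i=0}^{s-1}\frac{(s\rho)^i}{i!} + \frac{(s\rho)^s}{s!\,(1-\rho)}} \qquad (5)$$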

Thus, for any number of servers,s, it is possible to compose exact expressions for our upperbound.
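As a sketch of how this composition might look in practice, the following Python functions evaluate (4), (5), and the bound of Theorem 1 for an integer threshold. The function names (`mms_pi0`, `mms_pn`, `mms_expected_work`, `theorem1_upper_bound`) and the argument `n_star` are introduced here for illustration and are not taken from the paper.

```python
from math import factorial


def mms_pi0(s, rho):
    """Empty-system probability of an M/M/s queue with per-server load rho, as in (4)."""
    head = sum((s * rho) ** i / factorial(i) for i in range(s))
    tail = (s * rho) ** s / (factorial(s) * (1.0 - rho))
    return 1.0 / (head + tail)


def mms_pn(n, s, rho):
    """Stationary probability of n jobs in an M/M/s queue, as in (4)."""
    pi0 = mms_pi0(s, rho)
    if n < s:
        return pi0 * (s * rho) ** n / factorial(n)
    return pi0 * rho ** n * s ** s / factorial(s)


def mms_expected_work(lam, mu, s):
    """Expected work E[N]/mu in an M/M/s queue, as in (5)."""
    rho = lam / (mu * s)
    queue = mms_pi0(s, rho) * rho / (1.0 - rho) ** 2 * (s * rho) ** s / factorial(s)
    return (s * rho + queue) / mu


def theorem1_upper_bound(lam, s, k, n_star):
    """Upper bound of Theorem 1 for the step function f(m) = k if m >= n_star, else 0."""
    mu_bar = 1.0 / (1.0 + k)             # every job inflated to mean 1 + k
    rho_bar = lam / (mu_bar * s)
    if rho_bar >= 1.0:
        raise ValueError("inflated system is overloaded; the formulas do not apply")
    p_below = sum(mms_pn(n, s, rho_bar) for n in range(n_star))
    return mms_expected_work(lam, mu_bar, s) - lam * (2 * k + k * k) * p_below


if __name__ == "__main__":
    print(theorem1_upper_bound(lam=0.6, s=2, k=0.1, n_star=2))
```

Setting `k = 0` returns the expected work of the ordinary M/M/s queue, which is a convenient check that the correction term vanishes when there is no delay effect.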

To demonstrate this process, we now explicitly evaluate our bound in two cases: a single server and

two servers. While such a small system may not be generally applicable to an ICU setting, there are

specialized ICUs which can be very small. For instance, in California, the smallest number of licensed

Medical/Surgical ICU beds amongst hospitals with such an ICU is 2 and three hospitals have a 3 bed

ICU (State of California Office of Statewide Health Planning & Development 2010-2011). More generally,

there are other service settings which include a delay effect and have few servers. For instance, Primary

Care may be one such setting (though the delay effect is likely much smaller than in the ED to ICU setting


which we are considering here). In our evaluation of explicit expressions, we consider N∗ = s, so that the
workload increases for any job which is delayed. Note that our empirical estimates find that occupancy
levels of 80% have a statistically significant relationship to increased ED boarding time (delay), which in
turn relates to an increase in ICU LOS. As we are examining the impact of delay (which is influenced by
occupancy levels), we introduce the delay effect in our queueing system when a job is actually delayed, i.e.
when N∗ = s.

The Single Server Case M/M(f)/1: We want to compare the behavior of the M/M(f)/1 system to
a regular M/M/1 system which does not have any delay effect. We denote the workload in an M/M/1
system with arrival rate λ and service rate 1 as $\underline{W}$ and note that:

$$E[\underline{W}] = \frac{\rho}{1-\rho}$$

for ρ = λ. For notational consistency, we maintain this definition of ρ = λ throughout the following analysis.
For our M/M(f)/1 system, we use the result derived in Theorem 1 to establish an upper bound to $E[W]$,
the expected work in this system:

$$E[W] \le W_{UB} = \frac{(1+k)^2\rho}{1-(1+k)\rho} - \lambda(2k+k^2)\frac{1}{1-(1+k)\rho} = \frac{\rho}{1-(1+k)\rho}$$

We consider the ratio between these two expressions to understand the relative increase in workload due
to the delay effect:

$$\frac{W_{UB}}{E[\underline{W}]} = \frac{1-\rho}{1-(1+k)\rho}$$

We can see the precise dependence on the growth factor k. Traditional queueing systems assume that k = 0.
To understand the impact of ignoring the delay effect, we can examine how the relative workload increases
with k, especially when k = 0. We have that:

$$\left.\frac{d}{dk}\left(\frac{W_{UB}}{E[\underline{W}]}\right)\right|_{k=0} = \left.\frac{\rho(1-\rho)}{(1-(1+k)\rho)^2}\right|_{k=0} = \frac{\rho}{1-\rho} = E[\underline{W}]$$

Hence, if we use a Taylor series approximation, we have that

$$\frac{W_{UB}}{E[\underline{W}]} \approx 1 + E[\underline{W}]\,k$$

so that the workload in our M/M(f)/1 system grows quadratically with the expected work in a traditional
M/M/1 system. When considering that the work grows exponentially in ρ for a traditional M/M/1 system,
we see that in our new system with delay dependent service times, the work will grow super exponentially
with ρ.

The Two Server Case M/M(f)/2: We now consider a similar analysis to the single server case when
there are two servers. Because there are two servers, we now define the system load ρ = λ/2 and maintain
this definition in what follows. The expected workload of an M/M/2 system with arrival rate λ and service
rate 1 is:

$$E[\underline{W}] = 2\rho + \frac{\dfrac{\rho}{(1-\rho)^2}\dfrac{(2\rho)^2}{2!}}{\displaystyle\sum_{i=0}^{1}\frac{(2\rho)^i}{i!} + \frac{(2\rho)^2}{2!\,(1-\rho)}} = \frac{2\rho}{1-\rho^2} \qquad (6)$$

For our M/M(f)/2 system, we use the upper bound derived in Theorem 1:

$$E[W] \le W_{UB} = \frac{2(1+k)^2\rho}{1-(1+k)^2\rho^2} - 2\rho\,\frac{(2k+k^2)(1+2(1+k)\rho)(1-(1+k)\rho)}{1+(1+k)\rho} \qquad (7)$$

We consider the ratio between these two expressions to understand the relative increase in workload due
to the delay effect:

$$\frac{W_{UB}}{E[\underline{W}]} = \frac{(1+k)^2(1-\rho^2)}{1-(1+k)^2\rho^2} - \frac{(2k+k^2)(1+2(1+k)\rho)(1-(1+k)\rho)(1-\rho^2)}{1+(1+k)\rho} \qquad (8)$$

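Again, we can see the precise dependence on the growth factor k. However, it is still cumbersome to fully
understand the impact of k. To do this, we again utilize the Taylor series approximation to understand the
impact of introducing the delay effect, i.e. when k = 0. We have that:

$$
\begin{aligned}
\left.\frac{d}{dk}\left(\frac{W_{UB}}{E[\underline{W}]}\right)\right|_{k=0}
&= \frac{2\rho^2}{1-\rho^2} + 6\rho^2 + 4\rho^3 \qquad (9) \\
&= \rho E[\underline{W}] + 6\rho^2 + 4\rho^3 \\
&\approx \begin{cases} \rho E[\underline{W}], & \rho \approx 1; \\ 6\rho^2 + 4\rho^3, & \rho \approx 0. \end{cases}
\end{aligned}
$$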

When the system is not very loaded, i.e. ρ ≈ 0, the polynomial terms, $6\rho^2 + 4\rho^3$, dominate in the
derivative. Hence, the relative increase in workload grows polynomially in the system load. We expect the
delay effect to have a much more substantial impact as the system becomes more heavily loaded. When ρ
is close to 1, the first term in the derivative dominates and the workload grows with respect to the expected
amount of work and system load in a delay-independent queueing system. When examining the impact of
the delay effect in conjunction with the M/M/1 case, we see that the delay effect increases the amount of
work in the system based on the expected workload in a traditional M/M/s system. In particular,

$$\frac{W_{UB}}{E[\underline{W}]} \approx 1 + E[\underline{W}]\,\rho k$$

around k = 0 and for ρ close to one. Similar to the single server case, we see the delay effect introduces a
quadratic term in $E[\underline{W}]$, the expected work of a traditional M/M/2 system. Figure 3 displays the derivative
in these regimes.


[Figure 3: two panels plotting the derivative at k = 0 against ρ: (a) comparison of the derivative with its polynomial approximation for ρ ≈ 0; (b) comparison of the derivative with ρE[W] for ρ ≈ 1.]

Figure 3 Simulation of M/M(f1)/s system

We can see that this bound lets us precisely characterize the impact of the delay-dependent service times
on the expected workload in the system. We can relate the increase in the expected work in the M/M(f)/s
system for any number of servers, s, to the workload in a system without any delay effect. Certainly, a
more heavily loaded system will experience more delay. This will magnify the impact of the delay effect.
On the other hand, when the system load is low, the delay effect will have little impact since few jobs
will experience delays and the subsequent growth in service requirement. Because most healthcare systems
operate in a regime where delays happen with relative frequency, it is important to understand the impact
of the delay effect. Our analysis is a first step in understanding how to incorporate delay-dependent service
times into queueing systems and how the delay effect can impact system behavior.

4.2. A General Upperbound for an M/M(f)/s System

As we saw in Section 2, the delay effect can be gradual. Thus, we now generalize our result from Theorem
1 to other delay-growth functions. Consider any growth function f(·) with a countable number of discontinuities.
Let $0 = M_0 < M_1 < M_2 < \cdots < M_{J-1} < M_J = \infty$ be break points in the function f, so that if the
number of jobs in the system upon arrival of a new job satisfies $M_{j-1} \le N_t < M_j$, the service rate of that job
is $1/(1+k_j)$, where $k_j \le k_{j+1}$. Hence,

$$f(m) = k_j, \quad \text{if } M_{j-1} \le m < M_j.$$

Thus, f is an arbitrary non-decreasing piece-wise constant function. Note that any growth function f can
be expressed via such a piece-wise constant function, since its argument is an integer number of jobs.

As we have described before, $W_t$ and $N_t$ are defined as the workload and number-of-jobs processes for this
delay-dependent queueing system. Similarly, let $\overline{W}_t$ be the workload process for an M/M/s system with arrival
rate λ and service rate $1/(1+k_J)$, where $k_J = \max_j k_j$. We can then establish the following upper bound to
our M/M(f)/s system:

Theorem 2 If f is a non-decreasing piece-wise constant function with $f(m) = k_j$ if $M_{j-1} \le m < M_j$, we
have that the workload process, $W_t$, satisfies

$$E[W] \le E[\overline{W}] - \sum_{j=1}^{J}\left[\lambda\left(2k_J + k_J^2 - 2k_j - k_j^2\right)P\left(M_{j-1} \le \overline{N}_t < M_j\right)\right]$$

The proof of this result requires a coupling argument and can be found in Appendix B. To provide some
insight into the interpretation of this bound, we parse through the two expressions which compose the
upper bound:

1. The first term corresponds to the expected work in the system if all jobs are inflated maximally to
mean service time $1 + k_J = 1 + f_{\max}$. Thus, it corresponds to the expected work in an M/M/s system
with $\rho = \lambda(1+k_J)/s$. However, most jobs will not be inflated to the maximum size, which brings us to the
second term.

2. The second term corresponds to the correction necessary for over-inflating the workload of jobs with
moderate or no wait upon arrival. If this occurs, the work that the new job brings is a factor of $\frac{1+k_j}{1+k_J}$ less
than the amount of work that arrives in the $\overline{W}_t$ system. Removing this extra work results in the multiplier
of the last expression.
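As a consistency check, the general bound can be specialized to the step function of Theorem 1. The sketch below reads the probability in the $j$-th term as $P(M_{j-1} \le \overline{N}_t < M_j)$, i.e. the probability that an arrival to the fully inflated system falls in the $j$-th interval; this reading is an assumption made here. Take $J = 2$ with break points $M_0 = 0$, $M_1 = N^*$, $M_2 = \infty$ and growth levels $k_1 = 0$, $k_2 = k$:

$$
\begin{aligned}
\sum_{j=1}^{2}\lambda\left(2k_2 + k_2^2 - 2k_j - k_j^2\right)P\left(M_{j-1} \le \overline{N}_t < M_j\right)
&= \lambda(2k + k^2)\,P(0 \le \overline{N}_t < N^*) + \lambda \cdot 0 \cdot P(\overline{N}_t \ge N^*) \\
&= \lambda(2k + k^2)\,P(\overline{N} < N^*),
\end{aligned}
$$

which is the correction term of Theorem 1. Jobs in the top interval receive no correction because they are inflated by the maximal amount in both systems.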

Note that the only time we rely on the exponential service times is to make the algebraic simplification
in Proposition 6 to establish the closed form expression for the correction term. Hence, the bound can be
extended to general service times, but may not result in expressions that are as clean.

5. Numerical Comparisons

We now turn our attention to examining the behavior of our delay-dependent queueing system along with

the quality of the derived upperbound. In particular, we wish to examine how this delay effect may impact

a real system. To do this we connect back to our empirical analysis in Section 2 to calibrate our model. We

consider a setting with a fixed number of servers (beds). If a job (patient) arrives and there is an available

server, it is immediately served. If there is no available server, it must wait. We consider the expected

workload in the systems.

5.1. Calibration of Model

We consider a model where patients have a nominal ICU LOS. If a new patient's admission is delayed, his
LOS increases by a constant factor k. That is, we examine the scenario where f(m) = k if m ≥ s and 0
otherwise. We need to determine the value of k. To do so, we turn back to our empirical analysis in Section
2. Recall that we found that when the ICU occupancy is above 80%, patient delays (ED Boarding)
seem to increase. In turn, longer ED boarding is associated with longer ICU LOS. In order to capture this
effect in our simulations, we account for an increase in service time whenever a job is delayed. Note that
because our initial queueing model does not account for the possibility of physicians 'saving' ICU beds for
scheduled surgeries or the potential of more severe patients arriving, delays occur only when the occupancy
level is 100%. That is, the service requirement increases whenever a job arrives and all servers are busy.

Given the heterogeneous impact on patients, we focus on a single condition category. Our numerical cal-

culations will be based on Cardiac patients. We selected this condition category because i) cardiac patients

demonstrate an increase in ICU LOS when delayed, ii) this is the largest group of patients in the hospital

system we are studying and iii) some hospitals have dedicated cardiac ICUs which primarily treat cardiac

patients. We notice that the mean LOS of Cardiac patients is 37.75 hours, the mean ED Boarding time is

3.57 hours, and each hour of boarding is associated with an increase in ICU LOS by 2.5668 hours.

We note that in our empirical analysis we estimated a linear growth function, which we will be approximating
with a step function. We do this in two ways. First we consider the smallest reasonable delay effect.
In this case, we know that 1 additional hour of boarding time is associated with 2.5668 additional hours in
the ICU. We make this the smallest delay effect possible and define the growth function $f = f_1$ as:

$$f_1(m) = \begin{cases} 0.068 = \dfrac{2.5668 \text{ additional hours}}{37.75 \text{ mean ICU LOS}}, & m \ge s; \\[2mm] 0, & \text{otherwise.} \end{cases}$$

On the other hand, we notice that the mean ED Boarding time is more than 3 hours. We use this observation
to consider a larger delay effect. In particular, we consider that patients will experience an average
delay of 3.57 hours, which translates to 9.16 = 3.57 × 2.5668 extra hours in the ICU. Hence, we consider a
second growth function:

$$f_2(m) = \begin{cases} 0.243 = \dfrac{9.16 \text{ additional hours}}{37.75 \text{ mean ICU LOS}}, & m \ge s; \\[2mm] 0, & \text{otherwise.} \end{cases}$$
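The calibration arithmetic above can be reproduced in a few lines. The sketch below uses only the three Cardiac figures quoted in the text; the constant names and the helper `step_growth` are hypothetical names introduced for illustration.

```python
# Cardiac-patient figures quoted in the text.
MEAN_ICU_LOS_HOURS = 37.75
MEAN_ED_BOARDING_HOURS = 3.57
LOS_HOURS_PER_BOARDING_HOUR = 2.5668

# Smaller delay effect: one extra hour of boarding, relative to the mean LOS.
k1 = LOS_HOURS_PER_BOARDING_HOUR / MEAN_ICU_LOS_HOURS

# Larger delay effect: the mean boarding time, converted to extra ICU hours.
extra_hours = MEAN_ED_BOARDING_HOURS * LOS_HOURS_PER_BOARDING_HOUR
k2 = extra_hours / MEAN_ICU_LOS_HOURS


def step_growth(k, s):
    """Return f(m) = k if m >= s (all servers busy on arrival), else 0."""
    return lambda m: k if m >= s else 0.0


if __name__ == "__main__":
    f1 = step_growth(k1, s=6)
    f2 = step_growth(k2, s=6)
    print(round(k1, 3), round(extra_hours, 2), round(k2, 3), f1(5), f2(6))
```

The returned functions can be passed wherever a growth function f is needed, for example with s = 6 or s = 15 to match the two ICU sizes considered in the simulations.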

We simulate the behavior of these delay-dependent queueing systems for a small (6 beds) and moderately
sized (15 beds) ICU. We compare the expected workload to three benchmarks:

1. [M/M/s with ρ = λ/s] This represents a traditional queueing system without delay effects. This is a
(trivial) lower bound to the delay-dependent system.

2. [M/M/s with ρ = λ(1+k)/s] This represents a queueing system where the amount of work each
job brings is artificially inflated as if all jobs experienced delays. This is a (trivial) upper bound to the
delay-dependent system.

3. [Upper bound derived in Theorem 1] This corrects for the miscalculation of work for jobs who are
not delayed.


We next examine how accurate the approximations are in order to gain more understanding of the impact

of the delay effect and when it is most important to account for it when using queueing models to provide

insight into various service settings.

5.2. Simulation Results

Figure 4 plots the expected workload, E[W], for different arrival rates. We make two observations about
the delay-dependent system. First, the upper bound is very accurate. Second, even with this very small
delay effect, we can see the behavior of the system is quite different from that of an M/M/s system. At
low loads, the delay-dependent system looks like an M/M/s system where no jobs are extended; this is
because few jobs, if any, are delayed. However, as the system load increases, more jobs are delayed and the
delay-dependent system transitions from the M/M/s system without any job growth to the M/M/s
system with constant job growth. It is clear that ignoring the delay effect can be misleading as to the actual
work in the system.

[Figure 4: two panels, (a) 6 bed ICU and (b) 15 bed ICU, each plotting E[W] (days, logarithmic axis) against λ (patients/day) for M/M/s (µ = 1), UB M/M(f)/s, M/M(f)/s, and M/M/s (µ = 1/(1+k)).]

Figure 4 Comparison of simulation of M/M(f1)/s system to the derived upper bound as well as traditional
M/M/s systems where no jobs or all jobs are inflated.

In order to get a better sense of the impact of the delay effect, in Figure 5, we examine the difference
in the expected workload of different models compared to a traditional M/M/s system where no jobs are
extended, i.e. ρ = λ/s. Most ICUs are not operated in a regime where patients are rarely or always delayed,
so we focus on arrival rates where at least a third of the beds turn over each day so there is some, but not
excessive, congestion in our system. Again, we see that our bound is fairly accurate. Moreover, it provides
more insight into the system workload than an M/M/s system where all jobs are inflated. Note that an
M/M/s system with µ = 1/(1+k) precisely characterizes the stability condition for a delay-dependent
queueing system (see Proposition 1). However, the dynamics of the workload are more nuanced.

[Figure 5: two panels, (a) 6 bed ICU and (b) 15 bed ICU, each plotting ∆E[W] (patient days) against λ (patients/day) for M/M/s (µ = 1/(1+k)) − M/M/s (µ = 1), UB M/M(f)/s − M/M/s (µ = 1), and M/M(f)/s − M/M/s (µ = 1).]

Figure 5 Simulation of M/M(f1)/s system: Difference in workload compared to a standard M/M/s system
with ρ = λ/s. Here the growth factor is 6.8%.

Figure 6 considers the increase in expected workload when the delay effect is much larger. In this case,
being delayed increases a patient's ICU LOS by nearly 25%, corresponding to patients seeing the average
delay of 3.57 hours. We notice that the upper bound is slightly looser. This is because the upper bound only
corrects the work a single job brings in, but not the propagation effect it has on delaying/not delaying future
jobs. This propagation is more substantial when the delay effect is larger. Still, we can see the upper bound
is a better measure of system load than the naive upper bound of an M/M/s system with ρ = λ(1+k)/s, i.e.
all jobs are extended.

Through our simulations, we can see that our derived upper bound can be quite accurate. Moreover, we

see that the expected workload for our M/M(f)/s system is very different when comparing to a system

without a delay effect. Ignoring the impact delays may have on service times may result in poor capacity

management and substantial under provisioning when using traditional queueing models to guide such

decisions. It is especially important to consider the delayeffect when the system is heavily loaded and most

jobs tend to experience some delay. Without accounting for the delay effect, a hospital ICU can become

even more congested. In order to manage this increase in system load, hospitals may have to cancel surgeries

and/or divert ambulances to reduce patient arrivals at a substantial loss in revenue. As the delay effect seems

to be prevalent in a number of healthcare settings, reconsidering the management of these systems in light

of delay sensitive service times may result in substantial operational and medical care improvements.
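As an illustration of the kind of simulation discussed above, the following Python sketch estimates the time-average workload of an M/M(f)/s FIFO queue with a single-threshold growth function. It is not the authors' simulation code. The function name `simulate_workload`, the parameter `n_star` (the congestion threshold), and the normalization of the nominal mean service time to 1 are assumptions made for this example. The estimate sums each job's integrated workload contribution, its service requirement times its wait plus half its squared service requirement, and divides by the time of the last arrival, so it slightly overstates the average on a finite horizon.

```python
import heapq
import random


def simulate_workload(lam, s, k, n_star, n_jobs=50_000, seed=0):
    """Estimate the time-average workload of an M/M(f)/s FIFO queue.

    Illustrative threshold growth function: f(m) = 0 if m < n_star else k,
    where m is the number of jobs an arrival finds in the system.
    """
    rng = random.Random(seed)
    free_at = [0.0] * s   # min-heap: times at which each server becomes free
    departures = []       # min-heap: departure times of jobs still in the system
    t = 0.0
    total = 0.0
    for _ in range(n_jobs):
        t += rng.expovariate(lam)        # Poisson arrivals with rate lam
        base = rng.expovariate(1.0)      # nominal service requirement, mean 1
        while departures and departures[0] <= t:
            heapq.heappop(departures)    # discard jobs that left before this arrival
        seen = len(departures)           # jobs found in the system on arrival
        work = base * (1.0 + (k if seen >= n_star else 0.0))
        start = max(t, heapq.heappop(free_at))   # FIFO: take the next free server
        wait = start - t
        heapq.heappush(free_at, start + work)
        heapq.heappush(departures, start + work)
        total += work * wait + 0.5 * work ** 2   # workload integrated over the job's stay
    return total / t
```

Calling the function twice with the same seed, once with `k = 0` and once with `k > 0`, gives a paired comparison of the workload with and without the delay effect, since both runs then share the same arrivals and nominal service draws.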

[Figure 6 plots: panels (a) 6 bed ICU and (b) 15 bed ICU. Horizontal axis: λ (patients/day). Vertical axis: ∆E[W] (patient days).]

Figure 6 Simulation of M/M(f2)/s system (increased delay effect): Difference in workload compared to a standard M/M/s system with ρ = λ/s. Here the growth factor is 24.39%.

6. Conclusion

To summarize, this work quantifies a relatively unstudied queuing phenomenon in a critical care setting – the

impact of delays on care requirements. We see that this natural phenomenon is substantially verified by data

and attempt to incorporate the phenomenon into simple queueing models. The impact of this phenomenon

is comparable with moderate service provisioning adjustments (which are expensive and can have dramatic

impact) and, as such, warrants careful attention.

Analyzing queueing systems with delay-dependent service times exactly can be cumbersome and intractable. As such, we focus on the development of reasonable approximations for the system workload. We find that 1) our approximations are quite accurate and 2) they provide expressions which allow for interpretations related to increases in system load. We find that ignoring the delay effect when using queueing models to guide operational decision making may result in substantial under-provisioning of resources such as beds, nurses, and physicians. Moreover, because the delay effect can be quite substantial, disregarding it may impede future attempts to make ICUs more efficient and effective. Incorporating a delay effect will result in more accurate estimates of system dynamics as well as targets for system improvement.

While we do not expect our models to directly translate into new capacity management criteria for hospital ICUs, we hope that this analysis demonstrates the impact of ignoring the delay effects when making such decisions. When the delay effects are ignored, ICUs continue to be highly congested. Such congestion can lead to other reactive actions such as rerouting (Kim et al. 2012), patient speedup (Kc and Terwiesch 2012), and ambulance diversion (Allon et al. 2013), which can also be detrimental to patient outcomes. From both a patient and a systems-level perspective, it is desirable to reduce delays. While reducing the average ED boarding time by an hour may be practically difficult, the adverse feedback of delays on increased service requirements suggests that even small reductions in boarding time, on the order of 10 to 15 minutes, may help reduce congestion.

References

Allon, G., S. Deo, W. Lin. 2013. The impact of hospital size and occupancy of hospital on the extent of ambulance diversion: Theory and evidence. Operations Research 61 554–562.
Anand, K., M. F. Pac, S. Veeraraghavan. 2010. Quality-Speed Conundrum: Tradeoffs in Customer-Intensive Services. Management Science 57 40–56.
Anderson, D., C. Price, B. Golden, W. Jank, E. Wasil. 2011. Examining the discharge practices of surgeons at a large medical center. Health Care Management Science 1–10.
Armony, M., C. Maglaras. 2004. Contact centers with a call-back option and real-time delay information. Operations Research 52 527–545.
Ata, B., S. Shneorson. 2006. Dynamic Control of an M/M/1 Service System with Adjustable Arrival and Service Rates. Management Science 52 1778–1791.
Batt, R.J., C. Terwiesch. 2012. Doctors under load: An empirical study of state-dependent service times in emergency care. Working Paper, The Wharton School.
Boxma, O.J., M. Vlasiou. 2007. On queues with service and interarrival times depending on waiting times. Queueing Systems 56 121–132.
Buist, M.D., G.E. Moore, S.A. Bernard, B.P. Waxman, J.N. Anderson, T.V. Nguyen. 2002. Effects of a medical emergency team on reduction of incidence of and mortality from unexpected cardiac arrests in hospital: preliminary study. British Medical Journal 324 387–390.
Burt, C.W., S.M. Schappert. 2004. Ambulatory care visits to physician offices, hospital outpatient departments, and emergency departments: United States, 1999-2000. Vital Health Stat. 13(157) 1–70.
Chalfin, D. B. 2005. Length of intensive care unit stay and patient outcome: The long and short of it all. Critical Care Medicine 33 2119–2120.
Chalfin, D. B., S. Trzeciak, A. Likourezos, B. M. Baumann, R. P. Dellinger. 2007. Impact of delayed transfer of critically ill patients from the emergency department to the intensive care unit. Critical Care Medicine 35 1477–1483.
Chan, C. W., V. F. Farias, N. Bambos, G. Escobar. 2012. Optimizing ICU discharge decisions with patient readmissions. Operations Research 60 1323–1342.
Chan, P.S., H.M. Krumholz, G. Nichol, B.K. Nallamothu. 2008. Delayed time to defibrillation after in-hospital cardiac arrest. New England Journal of Medicine 358(1) 9–17.
de Luca, G., H. Suryapranata, J.P. Ottervanger, E.M. Antman. 2004. Time delay to treatment and mortality in primary angioplasty for acute myocardial infarction: every minute of delay counts. Circulation 109(10) 1223–1225.


de Vericourt, F., O. B. Jennings. 2011. Nurse Staffing in Medical Units: A Queueing Perspective. Operations Research 59 1320–1331.
Dobson, G., H.H. Lee, E. Pinker. 2010. A model of ICU bumping. Operations Research 58 1564–1576.
Durrett, R. 1996. Probability: Theory and Examples. Duxbury Press.
Escobar, G. J., J. D. Greene, P. Scheirer, M. N. Gardner, D. Draper, P. Kipnis. 2008. Risk-adjusting hospital inpatient mortality using automated inpatient, outpatient, and laboratory databases. Medical Care 46 232–239.
George, J. M., J. M. Harrison. 2001. Dynamic control of a queue with adjustable service rate. Operations Research 49(5) 720–731.
Ghahramani, S. 1986. Finiteness of moments of partial busy periods for M/G/c queues. Journal of Applied Probability 23(1) 261–264.
Green, L. 2006. Queueing analysis in healthcare. Randolph W. Hall, ed., Patient Flow: Reducing Delay in Healthcare Delivery, International Series in Operations Research & Management Science, vol. 91. Springer US, 281–307.
Green, L. V., J. Soares, J. F. Giglio, R. A. Green. 2006. Using queuing theory to increase the effectiveness of emergency department provider staffing. Academic Emergency Medicine 13 61–68.
Halpern, N. A., S. M. Pastores. 2010. Critical care medicine in the United States 2000-2005: An analysis of bed numbers, occupancy rates, payer mix, and costs. Critical Care Medicine 38 65–71.
Kc, D., C. Terwiesch. 2009. Impact of workload on service time and patient safety: An econometric analysis of hospital operations. Management Science 55 1486–1498.
Kc, D., C. Terwiesch. 2012. An econometric analysis of patient flows in the cardiac intensive care unit. Manufacturing & Service Operations Management 14 50–65.
Kim, S-H, C. W. Chan, M. Olivares, G. Escobar. 2012. Managing inpatient units: An empirical study of capacity allocation and its implication on service outcomes. Working Paper, Columbia Business School.
Litvak, E., M.C. Long, A.B. Cooper, M.L. McManus. 2001. Emergency department diversion: causes and solutions. Acad Emerg Med 8 1108–1110.
Mandelbaum, A., W. Massey, M. Reiman. 1998. Strong approximations for Markovian service networks. Queueing Systems 30(1-2) 149–201.
Mandelbaum, A., G. Pats. 1998. State-dependent stochastic networks. Part I: Approximations and applications with continuous diffusion limits. The Annals of Applied Probability 8(2) 569–646.
Moreno, R.P., P. G. Metnitz, E. Almeida, B. Jordan, P. Bauer, R.A. Campos, G. Iapichino, D. Edbrooke, M. Capuzzo, J.R. Le Gall. 2005. SAPS 3–From evaluation of the patient to evaluation of the intensive care unit. Part 2: Development of a prognostic model for hospital mortality at ICU admission. Intensive Care Med 31 1345–1355.
Norton, S.A., L.A. Hogan, R.G. Holloway, H. Temkin-Greener, M.J. Buckley, T.E. Quill. 2007. Proactive palliative care in the medical intensive care unit: effects on length of stay for selected high-risk patients. Crit Care Med 35 1530–1535.


Powell, S. G., K. L. Schultz. 2004. Throughput in Serial Lines with State-Dependent Behavior. Management Science 50 1095–1105.
Rapoport, J., D. Teres, S. Lemeshow. 1996. Resource use implications of do not resuscitate orders for intensive care unit patients. Am J Respir Crit Care Med 153 185–190.
Renaud, B., A. Santin, E. Coma, N. Camus, D. Van Pelt, J. Hayon, M. Gurgui, E. Roupie, J. Herve, M.J. Fine, C. Brun-Buisson, J. Labarere. 2009. Association between timing of intensive care unit admission and outcomes for emergency department patients with community-acquired pneumonia. Critical Care Medicine 37(11) 2867–2874.
Rivera, A., J. F. Dasta, J. Varon. 2009. Critical Care Economics. Critical Care & Shock 12 124–129.
Rivers, E., B. Nguyen, S. Havstad, J. Ressler, A. Muzzin, B. Knoblich, E. Peterson, M. Tomlanovich. 2001. Early goal-directed therapy in the treatment of severe sepsis and septic shock. New England Journal of Medicine 345(19) 1368–1377.
Sheridan, R., J. Weber, K. Prelack, L. Petras, M. Lydon, R. Tompkins. 1999. Early burn center transfer shortens the length of hospitalization and reduces complications in children with serious burn injuries. J Burn Care Rehabil 20 347–50.
Shi, P., M. C. Chou, J. G. Dai, D. Ding, J. Sim. 2012. Hospital Inpatient Operations: Mathematical Models and Managerial Insights. Working Paper, Georgia Institute of Technology.
State of California Office of Statewide Health Planning & Development. 2010-2011. Annual Financial Data. URL http://www.oshpd.ca.gov/HID/Products/Hospitals/AnnFinanData/CmplteDataSet/index.asp.
Thompson, S., M. Nunez, R. Garfinkel, M.D. Dean. 2009. Efficient short-term allocation and reallocation of patients to floors of a hospital during demand surges. Operations Research 57(2) 261–273.
Whitt, W. 1990. Queues with service times and interarrival times depending linearly and randomly upon waiting times. Queueing Systems 6 335–352.
Whitt, W. 2003. How multiserver queues scale with growing congestion-dependent demand. Queueing Systems 51 531–542.
Wilper, A.P., S. Woolhandler, K.E. Lasser, D. McCormick, S.L. Cutrona, D.H. Bor, D.U. Himmelstein. 2008. Waits to see an emergency department physician: U.S. trends and predictors, 1997-2004. Health Affairs 27 w84–95.
Yankovic, N., S. Glied, L.V. Green, M. Grams. 2010. The impact of ambulance diversion on heart attack deaths. Inquiry 47 81–91.
Yankovic, N., L. Green. 2011. Identifying Good Nursing Levels: A Queuing Approach. Operations Research 59 942–955.
Zimmerman, J. E., A. A. Kramer, D.S. McNair, F. M. Malila. 2006. Acute Physiology and Chronic Health Evaluation (APACHE) IV: hospital mortality assessment for today's critically ill patients. Crit Care Med 34 1297–1310.


Appendix A: Miscellaneous Proofs

PROOF OF PROPOSITION 1:

Stability: First, we show that if $\lambda/s \le 1/(1+f_{\max})$, then the system is rate stable. This follows by examining a traditional M/M/s system with arrival rate $\lambda$ and mean service requirement $1 + f_{\max} = 1 + \max_m f(m)$. We couple the arrivals and the service times of the two systems so that if the mean service requirement of a job in our delay-dependent system is $\sigma \le 1 + f_{\max}$, its service requirement is $\sigma X$ and the service requirement in the M/M/s system is $(1 + f_{\max})X$, where $X$ is a mean 1, exponentially distributed random variable. It is easy to see that this M/M/s system is an upper-bounding system to our delay-dependent system. Hence, if the upper-bounding system is stable, so is the M/M(f)/s system. The stability condition for this upper-bounding system is the desired criterion.
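The stability criterion is simple to evaluate numerically. The short Python sketch below computes the largest arrival rate satisfying $\lambda/s \le 1/(1+f_{\max})$, with the nominal mean service time normalized to 1 as in the proof. The function names and the example growth values are hypothetical and chosen only for illustration.

```python
def max_stable_arrival_rate(s, growth_values):
    """Largest arrival rate lam with lam / s <= 1 / (1 + f_max).

    growth_values lists the values taken by the growth function f; only its
    maximum matters for the criterion.
    """
    f_max = max(growth_values)
    return s / (1.0 + f_max)


def is_rate_stable(lam, s, growth_values):
    """Check the rate-stability criterion for arrival rate lam and s servers."""
    return lam <= max_stable_arrival_rate(s, growth_values)
```

For instance, with 6 servers and a growth function taking the values 0 and 0.25, the threshold is 6/1.25 = 4.8 arrivals per unit of nominal service time.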

Instability: We now show that if $\lambda/s > 1/(1+f_{\max})$, then the system is unstable. We do this in two steps: 1) we show that, from any initial state, the time until the M/M(f)/s system reaches the state in which the service time of a new arrival would be maximally inflated and all the jobs in the system have been delayed enough that their service times are maximally inflated is finite with non-zero probability; 2) we establish the transience of this state, which will establish that our resulting system is transient and, hence, unstable.

We define the following notation. Let $N_{f_{\max}} = \min\{N : f(N) = \max_n f(n)\}$ be the minimum number of jobs in the system such that the service time for a new job is inflated maximally. Our state at time $t$ can be described by the $N_{f_{\max}}$-dimensional vector $Z_t$, where $(Z_t)_n$ is the number of jobs in the system which saw $n$ jobs upon arrival ($Z_{N_{f_{\max}}}$ is the number of jobs which saw $N_{f_{\max}}$ or more jobs in the system). Let $T_{xy} = \inf\{t > 0 : Z_t = y \mid Z_0 = x\}$ be the first passage time to state $y$ given we start in state $x$ at time 0. Finally, we define the state with exactly $N = \max(N_{f_{\max}}, s)$ jobs in the system, all of whose service times are maximally inflated, as $S^* = \{Z : Z_{N_{f_{\max}}} = \sum_n Z_n = N\}$.

We begin by showing that the time to reach state $S^*$ is finite with non-zero probability from any initial state. Specifically, we will show that for any state $x$, $P(T_{xS^*} < \infty) > 0$. Consider a system which starts at state $x$, i.e., $Z_0 = x$. Let $N_x$ be the number of jobs in the system in state $x$. We start by assuming $N_x < N_{f_{\max}} + N$. Our goal is to find the first time to state $S^*$. One way to get to $S^*$ is to have $N + N_{f_{\max}} - N_x$ jobs arrive before any job departs the system and then have $N_{f_{\max}} - N_x$ jobs depart from the system before another job arrives. Thus, the probability of this particular sample path occurring, which we denote as $E$, can be lower bounded by:

$$P(E) > \left(\frac{\lambda}{\lambda + s\mu_{\max}}\right)^{N + N_{f_{\max}} - N_x} \left(\frac{s\mu_{\min}}{\lambda + s\mu_{\max}}\right)^{N_{f_{\max}}} > 0$$

Moreover, the time it takes for this cascade of events to occur is upper bounded by the sum of $N + 2N_{f_{\max}} - N_x$ exponentially distributed random variables with mean $1/(\lambda + s\mu_{\min})$. Specifically, the time has a gamma distribution $T_E \sim \Gamma(N + 2N_{f_{\max}} - N_x,\ 1/(\lambda + s\mu_{\min}))$, which is finite with non-zero probability. Hence, we have that:

$$P(T_{xS^*} < \infty) > P(E)\,P(T_E < \infty) > 0$$

Note that if $N_x > N_{f_{\max}} + N$, we simply need $N_x - N$ jobs to depart before the next arrival. Using the same argument as above, we can show that $P(T_{xS^*} < \infty) > 0$ for any $x$.

Next, we demonstrate that the recurrence time for state $S^*$ is infinite with non-zero probability, i.e., $P(T_{S^*S^*} < \infty) < 1$. To do this, we will leverage the fact that a standard M/M/s queueing system with $\rho = \frac{\lambda}{s\mu_{\min}} = \frac{\lambda(1+f_{\max})}{s} > 1$ is unstable and, hence, transient. We consider two states in this M/M/s system: state $y$, with $N$ jobs in the system, and state $y^+$, with $N+1$ jobs in the system. Because this M/M/s system is transient, the first passage time from $y^+$ to $y$ satisfies $P(T^{M/M/s}_{y^+y} < \infty) < 1$. Here we use the superscript M/M/s to differentiate from the first passage time of our delay-dependent M/M(f)/s system, $T_{xy}$.

We leverage the preceding observation and decompose the recurrence time $T_{S^*S^*}$ according to whether the next event is an arrival or a departure, with the new state denoted by $y^+$ and $y^-$, respectively:

$$\begin{aligned}
P(T_{S^*S^*} < \infty) &= \frac{s\mu_{\min}}{\lambda + s\mu_{\min}}\, P(T_{y^-S^*} < \infty) + \frac{\lambda}{\lambda + s\mu_{\min}}\, P(T_{y^+S^*} < \infty) \\
&\le \frac{s\mu_{\min}}{\lambda + s\mu_{\min}} + \frac{\lambda}{\lambda + s\mu_{\min}}\, P(T_{y^+S^*} < \infty) < 1
\end{aligned}$$

The last inequality comes from the observation that all jobs which arrive to the system will see at least $N \ge N_{f_{\max}}$ jobs in the system before the system hits state $S^*$; hence, they will have service times exponentially distributed with mean $1 + f_{\max} = 1/\mu_{\min}$. Hence, the dynamics of our M/M(f)/s system are identical to the M/M/s system with arrival rate $\lambda$ and service rate $\mu_{\min}$ during the trajectory to the first visit to state $S^*$ from state $y^+$. Because the M/M/s system is transient, state $S^*$ is also transient in our M/M(f)/s system.

By Theorem 3.4 in Durrett (1996), all states in our M/M(f)/s system are transient since the time to reach a transient state ($y \in S^*$) is finite with non-zero probability for all states. Hence, the M/M(f)/s queue is unstable. □

Appendix B: Proof of Theorem 2

We now proceed with the proof of our main result. The proof will examine the case of Theorem 1, which assumes that the growth function $f$ is defined as:

$$f(m) = \begin{cases} 0, & m < N^*; \\ k, & m \ge N^*. \end{cases}$$

We note that the generalized result for Theorem 2 will follow similarly. The only changes required are additional notation and bookkeeping to keep track of each breakpoint in the growth function $f$. The proof will proceed in several steps. Again we will refer to our M/M(f)/s system as system 1 and an M/M/s system with arrival rate $\lambda$ and service rate $1/(1+k)$ as system 2.

Coupling: To begin we will construct a natural coupling between the M/M(f)/s and M/M/s systems above. In particular, we assume that both systems see a common arrival process. With an abuse of notation, let the service time for the $i$th arriving job in the latter system be $\bar W_i$; the corresponding service time in the delay-dependent system is then either $W_i = \bar W_i/(1+k)$ or $W_i = \bar W_i$, depending on whether the delay-dependent system has low congestion ($N_{t_i^-} < N^*$) or is considered busy ($N_{t_i^-} \ge N^*$) upon the arrival of the $i$th job. Finally, we assume that both systems start empty. Now let $\tau_i$ ($\bar\tau_i$) denote the amount of time the $i$th arriving job waits in the former (latter) system before beginning service. We have, as a consequence of our coupling, the following elementary result:

Proposition 3 $\tau_i \le \bar\tau_i$ for all $i$. Moreover, $N_t \le \bar N_t$ for all $t$.

PROOF: We prove the first statement. Proceeding by induction, observe that the statement is true for $i = 1$: $\tau_1 = \bar\tau_1 = 0$. Assume the statement is true for $i = l - 1$ and consider $i = l$. For the sake of contradiction assume that $\tau_l > \bar\tau_l$. Since the service discipline is FIFO in both systems, it follows that when job $l$ starts service in system 2:

• There are at most $s - 1$ jobs from among the first $l - 1$ arriving jobs present in system 2.

• Simultaneously, at least $s$ jobs from among the first $l - 1$ arriving jobs are still present in system 1, since job $l$ has not yet started service in system 1.

Consequently, given the induction hypothesis and the fact that by our coupling $W_i \le \bar W_i$ for $i = 1, 2, \ldots, l - 1$, there is a job among the first $l - 1$ arrivals that finished service strictly earlier in system 2 than in system 1. This is a contradiction. We have consequently established that $\tau_i \le \bar\tau_i$ for all $i$. The latter statement follows as a simple corollary. □
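The coupling and the waiting-time ordering of Proposition 3 can be checked numerically. The Python sketch below is a minimal illustration, not part of the paper: the function name `coupled_waits` and the parameter `n_star` (standing in for $N^*$) are assumptions. Both systems see the same arrivals. System 2 draws an exponential service time with mean $1 + k$, and system 1 uses that same draw, scaled down by $1 + k$ whenever the arriving job finds fewer than `n_star` jobs in system 1.

```python
import heapq
import random


def coupled_waits(lam, s, k, n_star, n_jobs=20_000, seed=1):
    """Return per-job (wait in system 1, wait in system 2) under the coupling."""
    rng = random.Random(seed)
    free1, free2 = [0.0] * s, [0.0] * s   # server-free times in systems 1 and 2
    in_sys1 = []                          # departure times of jobs still in system 1
    t, waits = 0.0, []
    for _ in range(n_jobs):
        t += rng.expovariate(lam)                  # common arrival process
        w_bar = rng.expovariate(1.0 / (1.0 + k))   # system 2 service time, mean 1 + k
        while in_sys1 and in_sys1[0] <= t:
            heapq.heappop(in_sys1)                 # jobs that already left system 1
        busy = len(in_sys1) >= n_star              # congestion seen on arrival to system 1
        w = w_bar if busy else w_bar / (1.0 + k)   # coupled system 1 service time
        start1 = max(t, heapq.heappop(free1))      # FIFO start of service, system 1
        start2 = max(t, heapq.heappop(free2))      # FIFO start of service, system 2
        heapq.heappush(free1, start1 + w)
        heapq.heappush(free2, start2 + w_bar)
        heapq.heappush(in_sys1, start1 + w)
        waits.append((start1 - t, start2 - t))
    return waits
```

On any sample path generated this way, every pair should satisfy the first statement of Proposition 3, i.e., the wait in system 1 should not exceed the wait in system 2.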

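We next use the result above to construct a first upper bound. We have:

Proposition 4

$$W_i\tau_i + \tfrac{1}{2}W_i^2 \le \bar W_i\bar\tau_i + \tfrac{1}{2}\bar W_i^2 - \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right)$$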

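PROOF: We begin with two elementary observations. First,

$$W_i \le \bar W_i$$

always under our coupling and, in particular, if $\bar N_{t_i^-} \ge N^*$. Further,

$$W_i \le \frac{\bar W_i}{1+k}$$

if $\bar N_{t_i^-} < N^*$. This follows from the fact that $\bar N_t \ge N_t$ (Proposition 3), so that $\bar N_{t_i^-} < N^*$ implies $N_{t_i^-} < N^*$. It follows that

$$\begin{aligned}
W_i\tau_i + \tfrac{1}{2}W_i^2 &\le \bar W_i\bar\tau_i + \tfrac{1}{2}W_i^2 \\
&\le \bar W_i\bar\tau_i + \mathbf{1}\{\bar N_{t_i^-} \ge N^*\}\,\tfrac{1}{2}\bar W_i^2 + \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\frac{\bar W_i^2}{(1+k)^2} \\
&= \bar W_i\bar\tau_i + \tfrac{1}{2}\bar W_i^2 - \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right)
\end{aligned}$$

The first inequality follows from the fact that $W_i \le \bar W_i$ (by our coupling) and $\tau_i \le \bar\tau_i$ (Proposition 3). The second inequality follows from the two observations we made at the outset. □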
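For readers checking the final equality in the display above, the coefficient comes from a one-line identity, written out here in LaTeX together with a numerical instance. The value $k = 0.25$ is chosen only for illustration.

```latex
% Coefficient of the indicator term in Proposition 4
1 - \frac{1}{(1+k)^2} = \frac{(1+k)^2 - 1}{(1+k)^2} = \frac{2k + k^2}{(1+k)^2}
% Illustrative instance with k = 0.25:
% (2(0.25) + 0.25^2) / 1.25^2 = 0.5625 / 1.5625 = 0.36 = 1 - 1/1.5625
```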

We next connect this result to the average workload in both systems (over a finite interval). Let $N(T)$ be the number of jobs that have arrived during $t \in [0, T]$. We have:

Proposition 5

$$\frac{1}{T}\int_0^T W_t\,dt \le \frac{1}{T}\int_0^T \bar W_t\,dt - \frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right) + \frac{\bar W_T}{T}$$

PROOF: Notice that the total workload contributed by job $i$ over time in system 1 is given by the quantity $W_i\tau_i + \tfrac{1}{2}W_i^2$, where the first term in the sum corresponds to the workload contributed while job $i$ waits, and the latter term corresponds to the workload contributed while job $i$ is in service. We consequently have:

$$\begin{aligned}
\frac{1}{T}\int_0^T W_t\,dt &\le \frac{1}{T}\sum_{i=1}^{N(T)}\left(W_i\tau_i + \tfrac{1}{2}W_i^2\right) \\
&\le \frac{1}{T}\sum_{i=1}^{N(T)}\left(\bar W_i\bar\tau_i + \tfrac{1}{2}\bar W_i^2\right) - \frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right) \\
&= \frac{1}{T}\int_0^T \bar W_t\,dt + \frac{\bar W_T}{T} - \frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right)
\end{aligned}$$

□

Note that the last equality comes from the fact that not all of the work which arrives between $[0, T]$ is completed by time $T$; hence $\bar W_T$ remains. What remains is to take limits on both sides of the inequality established in the previous result. To that end we begin with a few intermediary results. First, we provide a few definitions. We let $E[W]$ and $E[\bar W]$ be the expected work in our M/M(f)/s system and an M/M/s system with $\rho = \lambda(1+k)/s$, respectively.

Lemma 1 $\lim_{T\to\infty} \frac{1}{T}\int_0^T W_t\,dt = E[W]$

PROOF: This result follows directly from the renewal reward theorem and the fact that the system is stable. The reward function is the cumulative work and is defined as $R(t) = \int_0^t W_\tau\,d\tau$. □

Lemma 2 $\lim_{T\to\infty} \frac{1}{T}\int_0^T \bar W_t\,dt = E[\bar W]$

PROOF: Again, this result follows directly from the renewal reward theorem and the fact that the system is stable. The reward function is the cumulative work and is defined as $R(t) = \int_0^t \bar W_\tau\,d\tau$. □

Lemma 3 $\lim_{T\to\infty} \frac{1}{T}\int_0^T \mathbf{1}\{\bar N_t < N^*\}\,dt = P(\bar N_t < N^*)$

PROOF: Again, this result follows directly from the renewal reward theorem and the fact that the system is stable. The reward function is the total time the number of jobs in the system is less than $N^*$ and is defined as $R(t) = \int_0^t \mathbf{1}\{\bar N_\tau < N^*\}\,d\tau$. □

Lemma 4 $\lim_{T\to\infty} \frac{W_T}{T} = 0$

PROOF: This follows from the fact that the system is stable and thus recurrent. If we consider that $W_T$ is upper bounded by the amount of work that arrives between $[T_0^*(T), T]$, where $T_0^*(T) = \sup\{t < T : W_t = 0\}$ is the last time before $T$ that the system was empty, then the fact that the system is recurrent establishes that $P(T - T_0^*(T) < \infty) = 1$. Assuming a finite first moment for $W_i$ gives the desired result. □

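We next establish a limit for the second term on the right hand side of the inequality in Proposition 4.

Proposition 6

$$\lim_{T\to\infty} \frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\tfrac{1}{2}\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right) = \lambda(2k+k^2)\lim_{T\to\infty}\frac{1}{T}\int_0^T \mathbf{1}\{\bar N_t < N^*\}\,dt$$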

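PROOF: Let us denote, for notational convenience,

$$\tfrac{1}{2}E\bar W_i^2\left(\frac{2k+k^2}{(1+k)^2}\right) = 2k + k^2 \triangleq \alpha \quad\text{and}\quad \tfrac{1}{2}\left(\frac{2k+k^2}{(1+k)^2}\right) \triangleq \beta.$$

We begin with observing that

$$\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\alpha = \lambda\alpha\lim_{T\to\infty}\frac{1}{T}\int_0^T \mathbf{1}\{\bar N_t < N^*\}\,dt \tag{10}$$

by PASTA. Next, observe that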

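$$\begin{aligned}
E\left[\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\alpha\right]
&= \lim_{T\to\infty}\frac{1}{T}E\left[\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\alpha\right] \\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} E\left[\mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\alpha\right] \\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} E\left[\mathbf{1}\{\bar N_{t_i^-} < N^*\}\right]E\bar W_i^2\,\beta \\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} E\left[\mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta \\
&= \lim_{T\to\infty} E\left[\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta
\end{aligned} \tag{11}$$

The first equality above follows by dominated convergence (using the dominating random variable $N(T)/T$). The fourth equality (which is crucial) follows since $\mathbf{1}\{\bar N_{t_i^-} < N^*\}$ and $\bar W_i^2$ are independent random variables. Recall these are defined for the standard M/M/s system. Now, since $\lim_{T\to\infty}\frac{1}{T}\int_0^T \mathbf{1}\{\bar N_t < N^*\}\,dt$ is a constant (by Lemma 3), (10) and (11) together yield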

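$$\lim_{T\to\infty} E\left[\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta = \lambda\alpha\lim_{T\to\infty}\frac{1}{T}\int_0^T \mathbf{1}\{\bar N_t < N^*\}\,dt.$$

But from Lemma 5, which will come in Appendix B.1,

$$\begin{aligned}
\lim_{T\to\infty} E\left[\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta
&= E\left[\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta \\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\,\beta
\end{aligned}$$

Using Lemmas 1 and 2 to replace the limit with expectations gives the desired result. This completes the proof. □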
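Taking limits in Proposition 5 with Lemmas 1 to 4 and Proposition 6 suggests a workload bound of the form $E[W] \le E[\bar W] - \lambda(2k + k^2)\,P(\bar N_t < N^*)$. The Python sketch below evaluates this expression from the standard stationary distribution of an M/M/s queue with service rate $1/(1+k)$. It is an illustration under stated assumptions rather than the authors' code. The function names are hypothetical, the bound is the reading of the limit just described, and $E[\bar W]$ is computed as the mean number of jobs times $1 + k$, which relies on the memorylessness of exponential service times.

```python
from math import factorial


def mms_pmf(n, lam, mu, s):
    """Stationary probability of n jobs in a stable M/M/s queue."""
    a, rho = lam / mu, lam / (s * mu)
    norm = sum(a ** j / factorial(j) for j in range(s)) + a ** s / factorial(s) / (1 - rho)
    term = a ** n / factorial(n) if n <= s else a ** s / factorial(s) * rho ** (n - s)
    return term / norm


def workload_upper_bound(lam, s, k, n_star):
    """E[W_bar] - lam * (2k + k^2) * P(N_bar < n_star), with mu = 1 / (1 + k)."""
    mu = 1.0 / (1.0 + k)
    a, rho = lam / mu, lam / (s * mu)
    if rho >= 1:
        raise ValueError("the inflated M/M/s system must be stable")
    p_s = mms_pmf(s, lam, mu, s)
    mean_jobs = a + p_s * rho / (1 - rho) ** 2   # offered load plus mean queue length
    mean_work = mean_jobs / mu                   # each job carries 1/mu expected remaining work
    p_uncongested = sum(mms_pmf(n, lam, mu, s) for n in range(n_star))
    return mean_work - lam * (2 * k + k ** 2) * p_uncongested
```

Setting `k = 0` recovers the expected workload of the standard M/M/s system, and setting `n_star = 0` recovers the workload of the system in which every job is inflated.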

B.1. Existence of a Limit

The partial busy period of an M/G/s queue is defined as the time between when an arriving customer sees an empty system and the first time after that at which a departing customer sees an empty system. We will use the following result:

Theorem 3 (Ghahramani (1986)) The $m$th moments of the partial busy period of an M/G/s queue are finite if and only if the service time distribution has finite $m$th moments.

We denote by $T_m$ the length of the $m$th partial busy period. We can now establish:

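Lemma 5 Assume the service time distribution has finite fourth moments. Then,

$$\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2$$

exists and equals a constant. Further,

$$\lim_{T\to\infty} E\left[\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta = E\left[\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\right]\beta$$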

PROOF: We first establish that

$$\lim_{T\to\infty}\frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2\,\beta$$

exists and is constant. To see this, denote by $1 = j_1 < j_2 < j_3 \ldots$ the arrivals $i$ for which $\bar N_{t_i^-} = 0$. Observe that the random variables

$$X_m \triangleq \sum_{i=j_m}^{j_{m+1}-1} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2$$

are independent random variables. Moreover, $\sum_{i=j_m}^{j_{m+1}-1} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2 \le s^2T_m^2$. Note that since we have assumed the service time distribution has finite fourth moments, we have $ET_m^4 < \infty$. Now let $M(T) = \sup\{l \mid A_{j_l} \le T\}$; $M(T) \to \infty$. The strong law of large numbers then implies that

$$\lim_{T\to\infty}\frac{\sum_{m=1}^{M(T)} X_m}{M(T)}$$

exists and is a constant a.s. Further, a simple argument using Chebyshev's inequality and the Borel–Cantelli lemma implies that

$$\lim_{T\to\infty}\frac{X_{M(T)}}{T} = 0 \quad \text{a.s.}$$

Finally, the elementary renewal theorem implies that $\lim_{T\to\infty} \frac{M(T)}{T} = 1/ET_1$. But,

$$\frac{\sum_{m=1}^{M(T)} X_m}{M(T)}\,\frac{M(T)}{T} - \frac{X_{M(T)}}{T} \le \frac{1}{T}\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2 \le \frac{\sum_{m=1}^{M(T)} X_m}{M(T)}\,\frac{M(T)}{T} + \frac{X_{M(T)}}{T}$$

so that taking limits throughout and employing the above observations yields the first conclusion of the Lemma.

Now to establish the second conclusion, observe that

$$\sum_{i=1}^{N(T)} \mathbf{1}\{\bar N_{t_i^-} < N^*\}\,\bar W_i^2 \le \sum_{i=1}^{N(T)} \bar W_i^2$$

and that

$$E\sum_{i=1}^{N(T)} \bar W_i^2 = EN(T)\,E\bar W_i^2 = \lambda T\,E\bar W_i^2$$

where the first equality is Wald's identity. Consequently, we may apply the conclusion of the first part of the lemma along with the dominated convergence theorem to establish the second conclusion of the lemma. □
