
MARTINGALE PROOFS OF MANY-SERVER HEAVY-TRAFFIC LIMITS
FOR MARKOVIAN QUEUES

by

Ward Whitt

Department of Industrial Engineering and Operations Research
Columbia University
New York, NY 10027-6699
Email: [email protected]
URL: www.columbia.edu/∼ww2040

March 4, 2007

© Ward Whitt


Abstract

This is an expository paper illustrating the “martingale method” for proving many-server heavy-traffic stochastic-process limits for queueing models, which support diffusion approximations. Specifically, here the martingale method is applied to treat the M/M/∞, M/M/n/∞ (Erlang-C), M/M/n/0 (Erlang-B), M/M/n + M (Erlang-A, with +M indicating Markovian customer abandonment) and M/M/n/mn + M models. The Markovian stochastic process representing the number of customers in the system is constructed in terms of rate-1 Poisson processes in two ways: (i) through random time changes and (ii) through random thinnings. Associated martingale representations are obtained for these constructions by applying, respectively: (i) optional sampling theorems where the random time changes are the stopping times and (ii) the integration theorem associated with random thinning of a counting process. Convergence to the diffusion process limit for the appropriate sequence of scaled queueing processes is obtained both by applying martingales and without applying martingales. All methods of proof exploit the continuous mapping theorem. In addition to reviewing the martingale method for proving stochastic-process limits for queueing models, this paper reviews tightness criteria and the proof of the martingale functional central limit theorem.

Keywords: many-server heavy-traffic limits for queues, diffusion approximations, martingales, functional central limit theorems, martingale functional central limit theorem, invariance principles, tightness, stochastic boundedness, continuous mapping theorem.


Contents

1 Introduction
 1.1 The Classic M/M/∞ Model
 1.2 The QED Many-Server Heavy-Traffic Limiting Regime
 1.3 Literature Review
 1.4 Organization

2 Sample-Path Constructions
 2.1 Random Time Change of Unit-Rate Poisson Processes
 2.2 Random Thinning of Poisson Processes

3 Martingale Representations
 3.1 Martingale Basics
 3.2 Quadratic Variation and Covariation Processes
 3.3 Counting Processes
 3.4 The First Martingale Representation
 3.5 The Second Martingale Representation
 3.6 The Third Martingale Representation

4 The Main Steps in the Proof of Theorem 1.1
 4.1 The Integral Representation Is Continuous
 4.2 A Poisson FCLT Plus the CMT
 4.3 The Direct Fluid Limit

5 Tightness and Stochastic Boundedness
 5.1 Tightness
 5.2 Stochastic Boundedness
  5.2.1 Connection to Tightness
  5.2.2 Preservation
  5.2.3 Stochastic Boundedness for Martingales
  5.2.4 FWLLN from Stochastic Boundedness
 5.3 Tightness Criteria
  5.3.1 Classic Criteria Involving a Modulus of Continuity
  5.3.2 Simplifying the Modulus Condition
  5.3.3 Criteria Involving Stopping Times and Quadratic Variations

6 Completing the Proof of Theorem 1.1
 6.1 The Fluid Limit from Stochastic Boundedness of Xn in D
 6.2 Stochastic Boundedness of the Predictable Quadratic Variations
 6.3 The Initial Conditions

7 Other Models
 7.1 The Erlang A Model
 7.2 Finite Waiting Rooms
 7.3 General Non-Markovian Arrival Processes

8 The Martingale FCLT
 8.1 The Statement
 8.2 Proof of the Martingale FCLT
  8.2.1 Quick Proofs of C-Tightness
  8.2.2 The Ethier-Kurtz Proof of Tightness in Case (i)
  8.2.3 The Ethier-Kurtz Proof of Tightness in Case (ii)
  8.2.4 Characterization of the Limit with Uniformly Bounded Jumps
  8.2.5 The Ethier-Kurtz Characterization Proof in Case (i)
  8.2.6 The Ethier-Kurtz Characterization Proof in Case (ii)

9 Applications of the Martingale FCLT
 9.1 Proof of the Poisson FCLT: Theorem 4.2
 9.2 Third Proof of Theorem 1.1

10 Concluding Remarks

11 Acknowledgments

A Proof of Lemma 3.1


1. Introduction

This is an expository review paper, intended to illustrate how to do martingale proofs of many-server heavy-traffic limit theorems for queueing models, as in Krichagina and Puhalskii (1997) and Puhalskii and Reiman (2000). Even though the method is remarkably effective, it is somewhat complicated. Thus it is helpful to see how the argument applies to easy cases before considering more elaborate models.

For the more elementary Markovian models considered here, these many-server heavy-traffic limits were originally established by other methods, but better methods of proof keep getting developed. For the more elementary models, we show how the proofs can be done without martingales as well as with martingales. For that, we follow Mandelbaum and Pats (1995), but without their level of generality and without discussing strong approximations. Even though we focus on elementary Markov models here, we are motivated by the desire to treat more complicated models, such as the non-Markovian GI/GI/∞ and GI/PH/n models in Krichagina and Puhalskii (1997) and Puhalskii and Reiman (2000), and network generalizations of these. Thus we want to do more than achieve the fastest proof for the simple models, which does not rely so much on martingales; we want to illustrate other martingale methods that have proved useful for more complicated models.

In addition to reviewing the martingale method for proving stochastic-process limits for Markovian queueing models, we review the martingale functional central limit theorem and its proof, as in §7.1 of Ethier and Kurtz (1986) and Jacod and Shiryayev (1987). We also review tightness criteria, which play an important role in the proofs.

1.1. The Classic M/M/∞ Model

We start by focusing on what is a candidate to be the easiest queueing model – the classic M/M/∞ model – which has a Poisson arrival process (the first M), i.i.d. exponential service times (the second M), independent of the arrival process, and infinitely many servers. Let the arrival rate be λ and let the individual mean service time be 1/µ.

Afterwards, in §7.1, we treat the associated M/M/n/∞ + M (Erlang A) model, which has n servers, unlimited waiting room, the first-come first-served (FCFS) service discipline and Markovian customer abandonment (the final +M); customers abandon (leave) if they have to spend too long waiting in queue. The successive customer times to abandon are assumed to be i.i.d. exponential random variables, independent of the history up to that time. That assumption is often reasonable with invisible queues, as in telephone call centers. Limits for the associated M/M/n/∞ (Erlang C) model (without customer abandonment) are obtained as a corollary, by simply letting the abandonment rate be zero.

For the initial M/M/∞ model, let Q(t) be the number of customers in the system at time t, which coincides with the number of busy servers at time t. It is well known that Q(t) : t ≥ 0 is a birth-and-death stochastic process and that Q(t) ⇒ Q(∞) as t → ∞, where ⇒ denotes convergence in distribution, provided that P(Q(0) < ∞) = 1, and Q(∞) has a Poisson distribution with mean E[Q(∞)] = λ/µ.

We are interested in heavy-traffic limits in which the arrival rate is allowed to increase. Accordingly, we consider a sequence of models indexed by n and let the arrival rate in model n be

λn ≡ nµ, n ≥ 1 . (1.1)

Let Qn(t) be the number of customers in the system at time t in model n. By the observation above, Qn(∞) has a Poisson distribution with mean E[Qn(∞)] = λn/µ = n. Since the Poisson distribution approaches the normal distribution as the mean increases, we know that

(Qn(∞) − n)/√n ⇒ N(0, 1) as n → ∞ ,   (1.2)

where N(0, 1) denotes a standard normal random variable (with mean 0 and variance 1). However, we want to establish a limit for the entire stochastic process Qn(t) : t ≥ 0 as n → ∞. For that purpose, we consider the scaled processes Xn ≡ Xn(t) : t ≥ 0 defined by

Xn(t) ≡ (Qn(t) − n)/√n , t ≥ 0 .   (1.3)
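A quick numerical sketch may help fix ideas for (1.2): since Qn(∞) is Poisson with mean n, standardized Poisson samples should look increasingly like N(0, 1) as n grows. The following code is only an illustration, assuming NumPy is available; the sample sizes and the values of n are arbitrary choices, not taken from the paper.

```python
# Numerical sketch of (1.2): Q_n(infinity) ~ Poisson(n), so the scaled variable
# (Q_n(infinity) - n)/sqrt(n) should be approximately N(0, 1) for large n.
import numpy as np

rng = np.random.default_rng(0)
for n in (10, 100, 10_000):
    q = rng.poisson(lam=n, size=100_000)      # samples of Q_n(infinity)
    x = (q - n) / np.sqrt(n)                  # scaled as in (1.2) and (1.3)
    # mean -> 0, std -> 1, P(X <= 1) -> Phi(1) = 0.841... as n grows
    print(n, round(x.mean(), 3), round(x.std(), 3), round((x <= 1.0).mean(), 3))
```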

To establish a stochastic-process limit for the scaled processes in (1.3), we have to be careful about the initial conditions. We will assume that Xn(0) ⇒ X(0) as n → ∞. In addition, we assume that the random initial number of busy servers, Qn(0), is independent of the arrival process and the service times. Since the service-time distribution is exponential, the remaining service times of those customers initially in service have independent exponential distributions because of the lack-of-memory property of the exponential distribution.

The heavy-traffic limit theorem asserts that the sequence of stochastic processes Xn : n ≥ 1 converges in distribution in the function space D ≡ D([0,∞), R) to the Ornstein-Uhlenbeck (OU) diffusion process as n → ∞, provided that appropriate initial conditions are in force; see Billingsley (1968, 1999) and Whitt (2002) for background on such stochastic-process limits.

Here is the theorem for the basic M/M/∞ model:

Theorem 1.1. (heavy-traffic limit in D for the M/M/∞ model) Consider the sequence of M/M/∞ models defined above. If

Xn(0) ⇒ X(0) in R as n → ∞ ,   (1.4)

then

Xn ⇒ X in D as n → ∞ ,   (1.5)

where X is the OU diffusion process with infinitesimal mean m(x) = −µx and infinitesimal variance σ²(x) = 2µ. Alternatively, X satisfies the stochastic integral equation

X(t) = X(0) + √(2µ) B(t) − µ ∫_0^t X(s) ds , t ≥ 0 ,   (1.6)

where B ≡ B(t) : t ≥ 0 is a standard Brownian motion. Equivalently, X satisfies the stochastic differential equation (SDE)

dX(t) = −µX(t) dt + √(2µ) dB(t), t ≥ 0 .   (1.7)
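To visualize the limit process, one can discretize the SDE (1.7) by the standard Euler-Maruyama scheme; the sketch below does this under arbitrary illustrative choices of µ, the step size and the horizon (none of which come from the paper).

```python
# Euler-Maruyama sketch of the OU limit in (1.7): dX = -mu*X dt + sqrt(2*mu) dB.
import numpy as np

def simulate_ou(x0, mu, T, dt, rng):
    """Return a discretized path of the OU process on [0, T]."""
    n_steps = int(T / dt)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(dt))                       # Brownian increment
        x[i + 1] = x[i] - mu * x[i] * dt + np.sqrt(2 * mu) * dB
    return x

rng = np.random.default_rng(1)
path = simulate_ou(x0=0.0, mu=1.0, T=10.0, dt=0.01, rng=rng)
# The stationary distribution of this OU process is N(0, 1), matching (1.2).
print(round(path.mean(), 3), round(path.var(), 3))
```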

Much of this paper is devoted to three proofs of Theorem 1.1. It is possible to base the proof on the martingale functional central limit theorem (FCLT), as given in §7.1 of Ethier and Kurtz (1986) and here in §8, as we will show in §9.2, but it is not necessary to do so. Instead, it can be based on the classic FCLT for the Poisson process, which is an easy consequence of Donsker's FCLT for sums of i.i.d. random variables, and the continuous mapping theorem. Nevertheless, the martingale structure can still play an important role. With that approach, the martingale structure can be used to establish stochastic boundedness of the scaled queueing processes, which we show implies required fluid limits or functional weak laws of large numbers (FWLLN's) for random-time-change stochastic processes, needed for an application of the continuous mapping theorem with the composition map. Alternatively (the third method of proof), the fluid limit can be established by combining the same continuous mapping with the strong law of large numbers (SLLN) for the Poisson process.

It is also necessary to understand the characterization of the limiting diffusion process via (1.6) and (1.7). That is aided by the general theory of stochastic integration, which can be considered part of the martingale theory; e.g., see Ethier and Kurtz (1986) and Rogers and Williams (1987).

1.2. The QED Many-Server Heavy-Traffic Limiting Regime

We will also establish many-server heavy-traffic limits for Markovian models with finitely many servers, where the number n of servers goes to infinity along with the arrival rate in the limit. We will consider the sequence of models in the quality-and-efficiency-driven (QED) many-server heavy-traffic limiting regime, which is defined by the condition

(nµ − λn)/√n → βµ as n → ∞ .   (1.8)

This scaling, in which the arrival rate and the number of servers increase together according to (1.8), is just right to make the probability of delay converge to a nondegenerate limit (strictly between 0 and 1); see Halfin and Whitt (1981).

We will also allow finite waiting rooms of size mn, where the waiting rooms grow like √n as n → ∞, i.e., so that

mn/√n → κ ≥ 0 as n → ∞ .   (1.9)

With the spatial scaling by √n, as in (1.3), the scaling in (1.9) is just right to produce a reflecting upper barrier at κ in the limit process.

In addition, we allow Markovian abandonment, with each waiting customer abandoning at rate θ. We let the individual service rate µ and individual abandonment rate θ be fixed, independent of n. These modifications produce a sequence of M/M/n/mn + M models (with +M indicating abandonment). The special cases of the Erlang A (abandonment), B (blocking or loss) and C (delay) models are obtained, respectively, by (i) letting mn = ∞, (ii) letting mn = 0, in which case the +M plays no role, and (iii) letting mn = ∞ and θ = 0. So all the basic many-server Markovian queueing models are covered.
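As a concrete illustration of the QED scaling, the short sketch below chooses λn and mn so that (1.8) and (1.9) hold exactly for given β, κ and µ; the particular parameter values are arbitrary assumptions made only for the example.

```python
# QED scaling sketch: choose lambda_n and m_n so that (1.8) and (1.9) hold exactly.
import math

mu, beta, kappa = 1.0, 0.5, 2.0     # illustrative parameter choices
for n in (25, 100, 400, 1600):
    lam_n = n * mu - beta * mu * math.sqrt(n)   # (1.8): (n*mu - lam_n)/sqrt(n) = beta*mu
    m_n = int(round(kappa * math.sqrt(n)))      # (1.9): m_n / sqrt(n) -> kappa
    rho_n = lam_n / (n * mu)                    # traffic intensity tends to 1 like 1 - beta/sqrt(n)
    print(n, round(lam_n, 2), m_n, round(rho_n, 4))
```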

Here is the corresponding theorem for the M/M/n/mn + M model.

Theorem 1.2. (heavy-traffic limit in D for the M/M/n/mn + M model) Consider the sequence of M/M/n/mn + M models defined above, with the scaling in (1.8) and (1.9). Let Xn be as defined in (1.3). If

Xn(0) ⇒ X(0) in R as n → ∞ ,   (1.10)

then

Xn ⇒ X in D as n → ∞ ,   (1.11)

where X is the diffusion process with infinitesimal mean m(x) = −βµ − µx for x < 0 and m(x) = −βµ − θx for x > 0, infinitesimal variance σ²(x) = 2µ and reflecting upper barrier at κ. Alternatively, the limit process X is the unique (−∞, κ]-valued process satisfying the stochastic integral equation

X(t) = X(0) − βµt + √(2µ) B(t) − ∫_0^t [µ(X(s) ∧ 0) + θ(X(s) ∨ 0)] ds − U(t) , t ≥ 0 ,   (1.12)

where B ≡ B(t) : t ≥ 0 is a standard Brownian motion and U is the unique nondecreasing nonnegative process in D such that (1.12) holds and

∫_0^∞ 1{X(t−) < κ} dU(t) = 0 .   (1.13)
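For intuition about the limit process in Theorem 1.2, here is a crude Euler sketch in which the reflection at the upper barrier κ is imposed by truncating each step at κ; this is only one simple way to discretize a reflected diffusion, and all parameter values are illustrative assumptions.

```python
# Crude Euler sketch of the limit in Theorem 1.2: drift -beta*mu - mu*(x^0) - theta*(x v 0),
# infinitesimal variance 2*mu, with the reflection at kappa imposed by truncation.
import numpy as np

def simulate_limit(x0, mu, theta, beta, kappa, T, dt, rng):
    n_steps = int(T / dt)
    x = np.empty(n_steps + 1)
    x[0] = min(x0, kappa)
    for i in range(n_steps):
        drift = -beta * mu - mu * min(x[i], 0.0) - theta * max(x[i], 0.0)
        step = drift * dt + np.sqrt(2 * mu) * rng.normal(0.0, np.sqrt(dt))
        x[i + 1] = min(x[i] + step, kappa)      # reflection at kappa by truncation
    return x

rng = np.random.default_rng(2)
path = simulate_limit(x0=0.0, mu=1.0, theta=0.5, beta=0.5, kappa=2.0, T=20.0, dt=0.005, rng=rng)
print(round(path.min(), 3), round(path.max(), 3))   # the maximum should not exceed kappa
```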

1.3. Literature Review

The specific M/M/∞ result in Theorem 1.1 was first established by Iglehart (1965); see Borovkov (1967, 1984), Whitt (1982), Glynn and Whitt (1991), Massey and Whitt (1993), Mandelbaum and Pats (1995, 1998), Mandelbaum, Massey and Reiman (1998) and Krichagina and Puhalskii (1998) for further discussion and extensions. Iglehart applied a different argument; in particular, he applied Stone's (1963) theorem, which shows that it suffices to have the infinitesimal means and infinitesimal variances converge, plus other regularity conditions; see Whitt (2004) for a recent use of that same technique. Iglehart directly assumed finitely many servers and let the number of servers increase rapidly with the arrival rate; the number of servers increases so rapidly that it is tantamount to having infinitely many servers. The case of infinitely many servers can be treated by a minor modification of the same argument.

The corresponding QED finite-server result in Theorem 1.2, in which the arrival rate and number of servers increase together according to (1.8) in just the right way so that the probability of delay converges to a nondegenerate limit (strictly between 0 and 1), was established by Halfin and Whitt (1981) for the case of infinite waiting room and no abandonment. There have been many subsequent results; e.g., see Mandelbaum and Pats (1995, 1998), Srikant and Whitt (1996), Mandelbaum et al. (1998), Puhalskii and Reiman (2000), Garnett et al. (2002), Borst et al. (2004), Jelenkovic et al. (2004), Whitt (2005), Mandelbaum and Zeltyn (2005) and Reed (2005) for further discussion and extensions.

Puhalskii and Reiman (2000) apply the martingale argument to establish heavy-traffic limits in the QED regime. They establish many-server heavy-traffic limits for the GI/PH/n/∞ model, having a renewal arrival process (the GI), phase-type service-time distributions (the PH), n servers, unlimited waiting space, and the first-come first-served (FCFS) service discipline. They focus on the number of customers in each phase of service, which leads to convergence to a multi-dimensional diffusion process. One of our three proofs is essentially their argument.

Whitt (2005) applied the same martingale argument to treat the G/M/n/mn + M model, having a general stationary point process for an arrival process (the G), i.i.d. exponential service times, n servers, mn waiting spaces, the FCFS service discipline and customer abandonment with i.i.d. exponential times to abandon; see Theorem 3.1 in Whitt (2005). Whitt (2005) is primarily devoted to a heavy-traffic limit for the G/H∗2/n/mn model, having special H∗2 service-time distributions (a mixture of an exponential and a point mass at 0), treated by a different argument, but there is a short treatment of the G/M/n/mn + M model by the martingale argument, following Puhalskii and Reiman (2000) quite closely. The martingale proof is briefly outlined in §5 there. The extension to general arrival processes is perhaps the most interesting contribution there, but that generalization can also be achieved in other ways.

For our proof without martingales, we follow Mandelbaum and Pats (1995), but without using strong approximations as they do. Subsequent related work has established similar asymptotic results for Markovian service networks with multiple stations, each with multiple servers, and possibly multiple customer classes; e.g., see Armony (2005), Armony et al. (2006a,b), Atar (2005), Atar et al. (2004a,b), Dai and Tezcan (2005), Gurvich and Whitt (2006), Harrison and Zeevi (2004), and Tezcan (2006).

For the martingale functional central limit theorem, we draw on Ethier and Kurtz (1986) and Jacod and Shiryayev (1987). An important role is played by simple “one-dimensional” criteria for tightness of stochastic processes in D, going beyond the criteria in Billingsley (1968), reviewed in Whitt (2002). The alternative one-dimensional criteria come from Billingsley (1974, 1999), Kurtz (1975, 1981), Aldous (1978, 1989), Rebolledo (1980) and Jacod et al. (1983).

1.4. Organization

The rest of this paper is organized as follows: We start in §2 by constructing the Markovian stochastic process Q(t) : t ≥ 0 representing the number of customers in the system in terms of rate-1 Poisson processes. We do this in two ways: (i) through random time changes and (ii) through random thinnings. Then in §3 we construct associated martingale representations. We justify these constructions by applying, respectively: (i) optional sampling theorems where the random time changes are the stopping times and (ii) the integration theorem associated with random thinning of a counting process. The two resulting integral representations are very similar. These integral representations are summarized in Theorems 3.4 and 3.5. In §3 we also present a third martingale representation, based on constructing counting processes associated with a continuous-time Markov chain, exploiting the infinitesimal generator.

Section 4 is devoted to the main steps of the proof of Theorem 1.1. In §4.1 we show that the integral representation has a unique solution and constitutes a continuous function mapping R × D into D. In order to establish measurability, we establish continuity in the Skorohod J1 topology as well as the topology of uniform convergence over bounded intervals. As a consequence of the continuity, it suffices to prove FCLT's for the scaled martingales in these integral representations. For the first martingale representation, the scaled martingales themselves are random time changes of the scaled Poisson process. In §4.2 we show that a FCLT for the martingales based on this first representation, and thus the scaled queueing processes, can be obtained by applying the classical FCLT for the Poisson process and the continuous mapping theorem with the composition map.

To carry out the continuous-mapping argument above, we also need to establish the required fluid limit or functional weak law of large numbers (FWLLN) for the random time changes, needed in the application of the CMT with the composition map. In §4.3, following Mandelbaum and Pats (1995), we show that this fluid-limit step can be established by applying the strong law of large numbers (SLLN) for the Poisson process with the same continuous mapping determined by the integral representation. If we do not use martingales, then we observe that it is easy to extend the stochastic-process limit to general arrival processes satisfying a FCLT.

But we can also use martingales to establish the fluid limit. The martingale structure can be exploited via the Lenglart-Rebolledo inequality to prove stochastic boundedness, first for the scaled martingales and then for the scaled queueing processes, which in turn can be used to establish fluid limits or functional weak laws of large numbers (FWLLN's) for the scaled random time changes.

Since this martingale proof of the fluid limit relies on stochastic boundedness, which is related to tightness, we present background on these two important concepts in §5. That section constitutes a significant digression, providing much more background than needed for the proof of Theorem 1.1. We review the simple one-dimensional criteria for tightness in §5.3. For the proof of Theorem 1.1, we conclude there that it suffices to show that the predictable quadratic variations of the square-integrable martingales are stochastically bounded in R in order to have the sequence of scaled queueing processes Xn be stochastically bounded in D.

We complete the proof of Theorem 1.1 in §6. In Lemma 5.10 and §6.1 we show that stochastic boundedness of Xn in D implies the desired fluid limit or FWLLN for the scaled random-time-change processes, needed in an application of the continuous mapping with the composition function. In §6.2 we complete the proof by showing that the predictable quadratic variation processes of the martingales are indeed stochastically bounded in R. In §6.3 we show that it is possible to remove a moment condition imposed on Qn(0), the initial random number of customers in the system, in the martingale representation; in particular, we show that it is not necessary to assume that E[Qn(0)] < ∞.

In §7 we discuss stochastic-process limits for other queueing models. In §7.1 we present a martingale proof of the corresponding many-server heavy-traffic limit for the M/M/n/∞ + M model. Corresponding results hold for the M/M/n/∞ model (without customer abandonment) by setting the abandonment rate to zero. The proof is nearly the same as for the M/M/∞ model. The only significant difference is the use of the optional sampling theorem for multiple stopping times with respect to multiparameter martingales, as in Kurtz (1980), §§2.8 and 6.2 of Ethier and Kurtz (1986) and §12 of Mandelbaum and Pats (1998). We discuss the extension to cover finite waiting rooms in §7.2 and non-Markovian arrival processes in §7.3.

We state a version of the martingale FCLT from p. 339 of Ethier and Kurtz (1986) in §8. We also review the proof there, elaborating on the proof in Ethier and Kurtz (1986) and presenting alternative arguments, primarily based on Jacod and Shiryayev (1987).

In §9 we show how the martingale FCLT implies both the FCLT for a Poisson process and the required FCLT for the scaled martingales arising in the second and third martingale representations. We make a few concluding remarks in §10.

2. Sample-Path Constructions

We start by making direct sample-path constructions of the stochastic process Q(t) : t ≥ 0 representing the number of customers in the system in terms of independent Poisson processes. We show how this can be done in two different ways.

2.1. Random Time Change of Unit-Rate Poisson Processes

We first represent arrivals and departures as random time changes of independent unit-rate Poisson processes. For that purpose, let A ≡ A(t) : t ≥ 0 and S ≡ S(t) : t ≥ 0 be two independent Poisson processes, each with rate (intensity) 1. We use the process A to generate arrivals and the process S to generate service completions, and thus departures. Let the initial number of busy servers be Q(0). We assume that Q(0) is a proper random variable independent of the two Poisson processes A and S.

The arrival process is simple. We obtain the originally-specified arrival process with rate λ by simply scaling time in the rate-1 Poisson process A; i.e., we use Aλ(t) ≡ A(λt) for t ≥ 0. It is elementary to see that the stochastic process Aλ is a Poisson process with rate λ.

The treatment of service completions is more complicated. Let D(t) be the number of departures (service completions) in the interval [0, t]. We construct this process in terms of S by setting

D(t) ≡ S(µ ∫_0^t Q(s) ds), t ≥ 0 ,   (2.1)

but formula (2.1) is more complicated than it looks. The complication is that Q(t), appearing as part of the argument inside S, necessarily depends on the history Q(s) : 0 ≤ s < t, which in turn depends on the history of S, in particular, upon S(µ ∫_0^s Q(u) du) : 0 ≤ s < t. Hence formula (2.1) is recursive; we must show that it is well defined.

Of course, the idea is not so complicated: Formula (2.1) is a consequence of the fact that the intensity of departures at time s is µQ(s), where the number Q(s) of busy servers at time s is multiplied by the individual service rate µ. The function µ ∫_0^t Q(s) ds appearing as an argument inside S serves as a random time change; see §II.6 of Bremaud (1981) and Chapter 6 of Ethier and Kurtz (1986).

By the simple conservation of flow, which expresses the content at time t as the initial content plus flow in minus flow out, we have the basic expression

Q(t) ≡ Q(0) + A(λt) − D(t) = Q(0) + A(λt) − S(µ ∫_0^t Q(s) ds), t ≥ 0 .   (2.2)

Lemma 2.1. (construction) The stochastic process Q(t) : t ≥ 0 is well defined as a random element of the function space D by formula (2.2). Moreover, it is a birth-and-death stochastic process with constant birth rate λk = λ and linear state-dependent death rate µk = kµ.

Proof. To construct a bona fide random element of D, start by conditioning upon the random variable Q(0) and the two Poisson processes A and S. Then, with these sample paths specified, we recursively construct the sample path of the stochastic process Q(t) : t ≥ 0. By induction over the successive jump times of these Poisson processes, we show that the sample paths of Q(t) : t ≥ 0 are right-continuous piecewise-constant real-valued functions of t. Since the Poisson processes A and S have only finitely many transitions in any finite interval w.p.1, the same is necessarily true of the constructed process Q(t) : t ≥ 0. This sample-path construction follows Theorem 4.1 in Chapter 8 on p. 327 of Ethier and Kurtz (1986); the argument also appears in Theorem 9.2 of Mandelbaum et al. (1998). Finally, we can directly verify that the stochastic process Q(t) : t ≥ 0 satisfies the differential definition of a birth-and-death stochastic process. Let Ft represent the history of the system up to time t; that is, the σ-field generated by Q(s) : 0 ≤ s ≤ t. It is then straightforward that the infinitesimal transition rates are as claimed:

P (Q(t + h)−Q(t) = +1|Q(t) = k,Ft) = λh + o(h) as h → 0 ,

P (Q(t + h)−Q(t) = −1|Q(t) = k,Ft) = kµh + o(h) as h → 0 ,

P (Q(t + h)−Q(t) = 0|Q(t) = k,Ft) = 1− λh− kµh + o(h) as h → 0 , (2.3)

for each k ≥ 0, where o(h) denotes a function f such that f(h)/h → 0 as h ↓ 0. Ethier and Kurtz (1986) approach this last distributional step by verifying that the uniquely constructed process is the solution of the local-martingale problem for the generator of the Markov process, which in our case is the birth-and-death process.
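The recursive construction can be made concrete with a short simulation that builds a path of Q from pre-generated jump epochs of two unit-rate Poisson processes, in the spirit of (2.2). The sketch below is illustrative only; the parameter values, the horizon and the fixed number of pre-generated jumps are arbitrary assumptions.

```python
# Sample-path sketch of the random-time-change construction (2.2):
# Q(t) = Q(0) + A(lambda*t) - S(mu * int_0^t Q(s) ds),
# built event by event from the jump epochs of two unit-rate Poisson processes.
import numpy as np

def simulate_mminf(q0, lam, mu, T, rng, max_jumps=100_000):
    arr_epochs = np.cumsum(rng.exponential(1.0, max_jumps))   # jump times of A
    svc_epochs = np.cumsum(rng.exponential(1.0, max_jumps))   # jump times of S
    t, q, integral = 0.0, q0, 0.0      # integral tracks mu * int_0^t Q(s) ds
    ia, isv = 0, 0                     # next unused jump of A and of S
    times, values = [0.0], [q0]
    while True:
        t_arr = arr_epochs[ia] / lam                                    # next arrival epoch
        t_dep = t + (svc_epochs[isv] - integral) / (mu * q) if q > 0 else np.inf
        t_next = min(t_arr, t_dep)
        if t_next > T:
            break
        integral += mu * q * (t_next - t)       # Q is constant between events
        t = t_next
        if t_arr <= t_dep:
            q += 1; ia += 1                     # arrival
        else:
            q -= 1; isv += 1                    # departure
        times.append(t); values.append(q)
    return np.array(times), np.array(values)

rng = np.random.default_rng(3)
times, values = simulate_mminf(q0=0, lam=100.0, mu=1.0, T=10.0, rng=rng)
print(values[-1], round(values.mean(), 1))      # values hover around lambda/mu = 100 eventually
```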

Some further explanation is perhaps helpful. Our construction above is consistent with the construction of the queue-length process as a Markov process, specifically, a birth-and-death process. There are other possible constructions. A different construction would be the standard approach in discrete-event simulation, with event clocks. Upon the arrival of each customer, we might schedule the arrival time of the next customer, by generating an exponential interarrival time. We might also generate the required service time of the current arrival. With the simulation approach, we have information about the future state that we do not have with the Markov-process construction. The critical distinction between the different constructions involves the information available at each time. The information available is captured by the filtration, which we discuss with the martingale representations in the next section.

The random-time-change approach we have used here is natural when applying strong approximations, as was done by Mandelbaum, Massey and Reiman (1998) and Mandelbaum and Pats (1995, 1998). They applied the strong approximation for a Poisson process, as did Kurtz (1978). A different use of strong approximations to establish heavy-traffic limits for non-Markovian infinite-server models is contained in Glynn and Whitt (1991).


2.2. Random Thinning of Poisson Processes

We now present an alternative construction, which follows §II.5 of Bremaud (1981) and Puhalskii and Reiman (2000). For this construction, let Aλ and Sµ,k for k ≥ 1 be independent Poisson processes with rates λ and µ, respectively. As before, we assume that these Poisson processes are independent of the initial number of busy servers, Q(0). The remaining service times of these initial customers will be generated by the processes Sµ,k. This will not alter the overall system behavior, because the service-time distribution is exponential. By the lack-of-memory property, the remaining service times are distributed as i.i.d. exponential random variables, independent of the elapsed service times at time 0.

We let all arrivals be generated from the arrival process Aλ; we let service completions from individual servers be generated from the processes Sµ,k. To be specific, let the servers be numbered. With that numbering, at any time we use Sµ,k to generate service completions from the busy server with the kth smallest index among all busy servers. (We do not fix attention on a particular server, because we want the initial indices to refer to the busy servers at all times.)

Instead of (2.1), we define the departure process by

D(t) ≡ ∑_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dSµ,k(s), t ≥ 0 ,   (2.4)

where 1A is the indicator function of the event A; i.e., 1A(ω) = 1 if ω ∈ A and 1A(ω) = 0 otherwise. It is important that we use the left limit Q(s−) in the integrand of (2.4), so that the intensity of any service completion does not depend on that service completion itself; i.e., the functions Q(s−) and 1{Q(s−) ≥ k} are left-continuous in s for each sample point. Then, instead of (2.2), we have

Q(t) ≡ Q(0) + A(λt) − D(t)
     = Q(0) + A(λt) − ∑_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dSµ,k(s), t ≥ 0 .   (2.5)

With this alternative construction, there is an analog of Lemma 2.1, proved in essentially the same way. In this setting, it is even more evident that we can construct a sample path of the stochastic process Q(t) : t ≥ 0 by first conditioning on a realization of Q(0), Aλ(t) : t ≥ 0 and Sµ,k(t) : t ≥ 0, k ≥ 1.

Even though this construction is different from the one in §2.1, it too is consistent with the Markov-process view. Consistent with most applications, we know what has happened up to time t, but not future arrival times and service times.
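The thinning construction can also be simulated directly: all arrivals come from the rate-λ process, and a jump of Sµ,k at time s produces a departure only when Q(s−) ≥ k, exactly as in (2.4). In the sketch below the infinite family Sµ,k is truncated at a finite K, which is harmless as long as Q stays below K over the simulated horizon; all numerical choices are illustrative assumptions.

```python
# Sketch of the thinning construction (2.4)-(2.5), with the infinite family {S_{mu,k}}
# truncated at K processes.  A jump of S_{mu,k} is a departure only if Q(s-) >= k.
import numpy as np

def simulate_thinning(q0, lam, mu, T, K, rng):
    events = []                              # (time, k); k = 0 marks an arrival
    t = 0.0
    while True:                              # jump epochs of the arrival process A_lambda
        t += rng.exponential(1.0 / lam)
        if t > T:
            break
        events.append((t, 0))
    for k in range(1, K + 1):                # jump epochs of S_{mu,k}
        t = 0.0
        while True:
            t += rng.exponential(1.0 / mu)
            if t > T:
                break
            events.append((t, k))
    q = q0
    for _, k in sorted(events):
        if k == 0:
            q += 1                           # arrival
        elif q >= k:
            q -= 1                           # jump of S_{mu,k} with Q(s-) >= k: a departure
        # otherwise the jump is thinned out (the indicator in (2.4) is zero)
    return q

rng = np.random.default_rng(4)
print(simulate_thinning(q0=0, lam=20.0, mu=1.0, T=50.0, K=100, rng=rng))   # Q(T)
```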

3. Martingale Representations

For each of the two sample-path constructions in the previous section, we have associated martingale representations. At a high level, it is easy to summarize how to treat the two sample-path constructions, drawing only on the first two chapters of Bremaud (1981). We represent the random time changes as stopping times with respect to appropriate filtrations and apply versions of the optional sampling theorem, which states that random time changes of martingales are again martingales, under appropriate regularity conditions; e.g., see Theorem T2 on p. 7 of Bremaud (1981). The problems we consider tend to produce multiple random time changes, making it desirable to apply the optional sampling theorem for multiparameter random time changes, as in §§2.8 and 6.2 of Ethier and Kurtz (1986), but we postpone discussion of that more involved approach until Section 7.1.

For the random thinning, we instead apply the integration theorem for martingales associated with counting processes, as in Theorem T8 on p. 27 of Bremaud (1981), which concludes that integrals of predictable processes with respect to martingales associated with counting processes produce new martingales, under regularity conditions.

We also present a third martingale representation, based on martingales associated with generators of Markov processes. That approach applies nicely here, because the queueing processes we consider are birth-and-death processes. Thus we can also apply martingales associated with Markov chains, as on pp. 5-6 and 294 of Bremaud (1981).

3.1. Martingale Basics

In this subsection we present some preliminary material on continuous-time martingales. There is a large body of literature providing background, including the books by Bremaud (1981), Ethier and Kurtz (1986), Liptser and Shiryayev (1986), Jacod and Shiryayev (1987), Rogers and Williams (1987, 2000), Karatzas and Shreve (1988), Kallenberg (2002) and Protter (2005). The early book by Bremaud (1981) remains especially useful because of its focus on stochastic point processes and queueing models. More recent lecture notes by Kurtz (2001) and van der Vaart (2006) are very helpful as well.

Stochastic integrals play a prominent role, but we only need relatively simple cases. Since we will be considering martingales associated with counting processes, we rely on the elementary theory for finite-variation processes, as in §IV.3 of Rogers and Williams (1987). In that setting, the stochastic integrals reduce to ordinary Stieltjes integrals and we exploit integration by parts.

On the other hand, as can be seen from Theorems 1.1 and 1.2, the limiting stochastic processes involve stochastic integrals with respect to Brownian motion, which is substantially more complicated. However, that still leaves us well within the classical theory. We can apply the Ito calculus as in Chapter IV of Rogers and Williams (1987), without having to draw upon the advanced theory in Chapter VI.

For the stochastic-process limits in the martingale setting, it is natural to turn to Ethier and Kurtz (1986), Liptser and Shiryayev (1986) and Jacod and Shiryayev (1987). We primarily rely on Theorem 7.1 on p. 339 of Ethier and Kurtz (1986). The Shiryayev books are thorough; Liptser and Shiryayev (1986) focuses on basic martingale theory, while Jacod and Shiryayev (1987) focuses on stochastic-process limits.

We will start by imposing regularity conditions. We assume that all stochastic processes X ≡ X(t) : t ≥ 0 under consideration are measurable maps from an underlying probability space (Ω, F, P) to the function space D ≡ D([0,∞), R) endowed with the standard Skorohod J1 topology and the associated Borel σ-field (generated by the open subsets), which coincides with the usual σ-field generated by the coordinate projection maps; see §§3.3 and 11.5 of Whitt (2002).

Since we will be working with martingales, a prominent role is played by the filtration (histories, family of σ-fields) F ≡ Ft : t ≥ 0 defined on the underlying probability space (Ω, F, P). (We have the containments: Ft1 ⊆ Ft2 ⊆ F for all t1 < t2.) As is customary, see p. 1 of Liptser and Shiryayev, we assume that all filtrations satisfy the usual conditions:

(i) they are right-continuous:

Ft = ∩_{u: u > t} Fu for all t, 0 ≤ t < ∞ , and

(ii) complete (F0, and thus Ft, contains all P-null sets of F).


We will assume that any stochastic process X under consideration is adapted to the filtration; i.e., X is F-adapted, which means that X(t) is Ft-measurable for each t. These regularity conditions guarantee desirable measurability properties, such as progressive measurability: The stochastic process X is progressively measurable with respect to the filtration F if, for each t ≥ 0 and Borel measurable subset A of R, the set (s, ω) : 0 ≤ s ≤ t, ω ∈ Ω, X(s, ω) ∈ A belongs to the product σ-field B([0, t]) × Ft, where B([0, t]) is the usual Borel σ-field on the interval [0, t]. See p. 5 of Karatzas and Shreve (1988) and Section 1.1 of Liptser and Shiryayev (1986). In turn, progressive measurability implies measurability, regarding X as a map from the product space Ω × [0,∞) to R.

A stochastic process M ≡ M(t) : t ≥ 0 is a martingale (submartingale) with respect to a filtration F ≡ Ft : t ≥ 0 if M is adapted to F and E[|M(t)|] < ∞ for each t ≥ 0, and

E[M(t + s)|Ft] = (≥) M(t)   (3.1)

with probability 1 (w.p.1) with respect to the underlying probability measure P for each t ≥ 0 and s > 0.

It is often important to have a stronger property than the finite-moment condition E[|M(t)|] < ∞. A stochastic process X ≡ X(t) : t ≥ 0 is uniformly integrable (UI) if

lim_{n→∞} sup_{t≥0} E[|X(t)| 1{|X(t)| > n}] = 0 ;   (3.2)

see p. 286 of Bremaud (1981) and p. 114 of Rogers and Williams (2000). We remark that UI implies, but is not implied by,

sup_{t≥0} E[|X(t)|] < ∞ .   (3.3)

A word of warning is appropriate, because the books are not consistent in their assumptions about integrability. A stochastic process X ≡ X(t) : t ≥ 0 may be called integrable if E[|X(t)|] < ∞ for all t or if the stronger (3.3) holds. This variation occurs with square-integrable, defined below. Similarly, the basic objects may be taken to be martingales, as defined above, or might instead be UI martingales, as on p. 20 of Liptser and Shiryayev (1987).

The stronger UI property is used in preservation theorems, theorems implying that stopped martingales and stochastic integrals with respect to martingales remain martingales. In order to get this property when it is not at first present, the technique of localizing is applied. We localize by introducing associated stopped processes, where the stopping is done with stopping times. A nonnegative random variable τ is an F-stopping time if stopping sometime before t depends only on the history up to time t, i.e., if

{τ ≤ t} ∈ Ft for all t ≥ 0 .   (3.4)

For any class C of stochastic processes, we define the associated local class Cloc as the class of stochastic processes X(t) : t ≥ 0 for which there exists a sequence of stopping times τn : n ≥ 1 such that τn → ∞ w.p.1 as n → ∞ and the associated stopped processes X(τn ∧ t) : t ≥ 0 belong to class C for each n, where a ∧ b ≡ min{a, b}. We obtain the class of local martingales when C is the class of martingales. Localizing expands the scope, because if we start with martingales, then such stopped martingales remain martingales, so that all martingales are local martingales. Localizing is important, because the stopped processes not only are martingales but can be taken to be UI martingales; see p. 21 of Liptser and Shiryayev (1987) and §IV.12 of Rogers and Williams (1987). The UI property is needed for the preservation theorems.


As is customary, we will also be exploiting predictable stochastic processes, which we will take to mean having left-continuous sample paths. See p. 8 of Bremaud (1981) and §1.2 of Jacod and Shiryayev (1987) for the more general definition and additional discussion. The general idea is that a stochastic process X ≡ X(t) : t ≥ 0 is predictable if its value at t is determined by values at times prior to t. But we can obtain simplification by working with stochastic processes with sample paths in D. In the setting of D, we can have a left-continuous process by either (i) considering the left-limit version X(t−) : t ≥ 0 of a stochastic process X in D (with X(0−) ≡ X(0)) or (ii) considering a stochastic process in D with continuous sample paths. Once we restrict attention to stochastic processes with sample paths in D, we do not need the more general notion, because the left-continuous version is always well defined. If we allowed more general sample paths, that would not be the case.

We will be interested in martingales associated with counting processes, adapted to the appropriate filtration. These counting processes will be nonnegative submartingale processes. Thus we will be applying the following special case of the Doob-Meyer decomposition.

Theorem 3.1. (Doob-Meyer decomposition for nonnegative submartingales) If Y is a submartingale with nonnegative sample paths, E[Y(t)] < ∞ for each t, and Y is adapted to a filtration F ≡ Ft, then there exists an F-predictable process A, called the compensator of Y or the dual-predictable projection, such that A has nonnegative nondecreasing sample paths, E[A(t)] < ∞ for each t, and M ≡ Y − A is an F-martingale. The compensator is unique in the sense that the sample paths of any two versions must be equal w.p.1.

Proof. See §1.4 of Karatzas and Shreve (1988). The DL condition in Karatzas and Shreve (1988) is satisfied because of the assumed nonnegativity; see Definition 4.8 and Problem 4.9 on p. 24. For a full account, see §VI.6 of Rogers and Williams (1987).

3.2. Quadratic Variation and Covariation Processes

A central role in the martingale approach to stochastic-process limits is played by the quadratic-variation and quadratic-covariation processes, as can be seen from the martingale FCLT stated here in §8. That in turn depends on the notion of square-integrability. We say that a stochastic process X ≡ X(t) : t ≥ 0 is square integrable if E[X(t)²] < ∞ for each t ≥ 0. We thus say that a martingale M ≡ M(t) : t ≥ 0 (with respect to some filtration) is square integrable if E[M(t)²] < ∞ for each t ≥ 0. Again, to expand the scope, we can localize, and focus on the class of locally square-integrable martingales. Because we can localize to get the square-integrability, the condition is not very restrictive.

If M is a square-integrable martingale, then M² ≡ M(t)² : t ≥ 0 is necessarily a submartingale with nonnegative sample paths, and thus satisfies the conditions of Theorem 3.1. The predictable quadratic variation (PQV) of a square-integrable martingale M, denoted by 〈M〉 ≡ 〈M〉(t) : t ≥ 0 (the angle-bracket process), is the compensator of the submartingale M²; i.e., the stochastic process 〈M〉 is the unique nondecreasing nonnegative predictable process such that E[〈M〉(t)] < ∞ for each t and M² − 〈M〉 is a martingale with respect to the reference filtration. (Again, uniqueness holds to the extent that any two versions have the same sample paths w.p.1.) Not only does square integrability extend by localizing, but Theorem 3.1 has a local version; see p. 375 of Rogers and Williams (1987). As a consequence, the PQV is well defined for any locally square-integrable martingale.

Given two locally square-integrable martingales M1 and M2, the predictable quadratic covariation can be defined as

〈M1, M2〉 ≡ (1/4)(〈M1 + M2〉 − 〈M1 − M2〉) ;   (3.5)


see p. 48 of Liptser and Shiryayev (1987). It can be characterized as the unique (up to equality of sample paths w.p.1) predictable process with sample paths of finite variation over bounded intervals such that E[|〈M1, M2〉(t)|] < ∞ for each t and M1M2 − 〈M1, M2〉 is a martingale.

We will also be interested in another quadratic variation of a square-integrable martingale M, the so-called optional quadratic variation [M] (OQV, the square-bracket process). The square-bracket process is actually more general than the angle-bracket process, because the square-bracket process is well defined for any local martingale, as opposed to only locally square-integrable martingales; see §§IV.18, IV.26 and VI.36, VI.37 of Rogers and Williams (1987). The following is Theorem 37.8 on p. 389 of Rogers and Williams. For a stochastic process M, let ∆M(t) ≡ M(t) − M(t−), the jump at t, for t ≥ 0.

Theorem 3.2. (optional quadratic covariation for local martingales) Let M1 and M2 be local martingales with M1(0) = M2(0) = 0. Then there exists a unique process, denoted by [M1, M2], with sample paths of finite variation over bounded intervals and [M1, M2](0) = 0 such that

(i) M1M2 − [M1, M2] is a local martingale, and
(ii) ∆[M1, M2] = (∆M1)(∆M2) .   (3.6)

For one local martingale M, the optional quadratic variation is then defined as [M] ≡ [M, M].

Note that, for a locally square-integrable martingale M, both M² − 〈M〉 and M² − [M] are local martingales, but 〈M〉 is predictable while [M] is not. Indeed, subtracting, we see that [M] − 〈M〉 is a local martingale, so that 〈M〉 is the compensator of both M² and [M].

There is also an alternative definition of the OQV; see Theorem 5.57 in §5.8 of van der Vaart (2006).

Theorem 3.3. (alternative definition) If M1 and M2 are local martingales with M1(0) = M2(0) = 0, then

[M1, M2](t) ≡ lim_{n→∞} ∑_{i=1}^∞ (M1(tn,i) − M1(tn,i−1))(M2(tn,i) − M2(tn,i−1)), t ≥ 0 ,   (3.7)

where tn,i ≡ t ∧ (i 2^{−n}) and the mode of convergence for the limit as n → ∞ is understood to be in probability. The limit is independent of the way that the time points tn,i are selected within the interval [0, t], provided that tn,i > tn,i−1 and that the maximum difference tn,i − tn,i−1 for points inside the interval [0, t] goes to 0 as n → ∞.
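The partition-sum characterization in Theorem 3.3 is easy to check numerically for the compensated Poisson martingale M(t) = N(t) − t, for which the sums of squared increments should approach the number of jumps N(t), consistent with [M] = N in Lemma 3.1 below. The following code is a purely illustrative sketch with arbitrary parameter choices.

```python
# Numerical illustration of (3.7): for M(t) = N(t) - t with N a rate-1 Poisson process,
# the dyadic partition sums of squared increments approach N(t_end).
import numpy as np

rng = np.random.default_rng(5)
t_end = 10.0
jump_times = np.cumsum(rng.exponential(1.0, 100))
jump_times = jump_times[jump_times <= t_end]          # jumps of N on [0, t_end]

def M(t):
    """Compensated Poisson martingale M(t) = N(t) - t, evaluated at an array of times."""
    return np.searchsorted(jump_times, t, side="right") - t

for n in (4, 8, 12, 16):
    grid = np.minimum(np.arange(int(t_end * 2**n) + 2) * 2.0**-n, t_end)   # t_{n,i} = t ^ (i 2^-n)
    increments = np.diff(M(grid))
    print(n, round(float(np.sum(increments**2)), 4))   # -> N(t_end)
print("N(t_end) =", len(jump_times))
```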

Unfortunately, these two quadratic-variation processes 〈M〉 and [M] associated with a locally square-integrable martingale M, and the more general covariation processes 〈M1, M2〉 and [M1, M2], are somewhat elusive, since the definitions are indirect; it remains to exhibit these processes. We will exploit our sample-path constructions in terms of Poisson processes above to identify appropriate quadratic variation and covariation processes in the following subsections.

Fortunately, however, the story about structure is relatively simple in the two cases of interest to us: (i) when the martingale is a compensated counting process, and (ii) when the martingale has continuous sample paths. The story in the second case is easy to tell: When M is continuous, 〈M〉 = [M], and this (predictable and optional) quadratic variation process itself is continuous; see §VI.34 of Rogers and Williams (1987). This case applies to Brownian motion and our limit processes. For standard Brownian motion, 〈M〉(t) = [M](t) = t, t ≥ 0.
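A companion check for the continuous case: for standard Brownian motion the partition sums of squared increments over [0, t] approach t, matching 〈M〉(t) = [M](t) = t. Again this is only an illustrative sketch with arbitrary choices of horizon and mesh.

```python
# Quadratic variation of standard Brownian motion: sums of squared increments -> t.
import numpy as np

rng = np.random.default_rng(7)
t_end = 10.0
for n in (6, 10, 14):
    n_steps = int(t_end * 2**n)
    increments = rng.normal(0.0, np.sqrt(2.0**-n), n_steps)  # Brownian increments on a 2^-n grid
    print(n, round(float(np.sum(increments**2)), 3))          # -> t_end = 10
```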


3.3. Counting Processes

The martingales we consider for our pre-limit processes will be compensated counting processes. By a counting process (or point process), we mean a stochastic process N ≡ N(t) : t ≥ 0 with nondecreasing nonnegative-integer-valued sample paths in D and N(0) = 0. We say that N is a unit-jump counting process if all jumps are of size 1. We say that N is non-explosive if N(t) < ∞ w.p.1 for each t < ∞. Equivalently, if Tn : n ≥ 1 is the associated sequence of points, where

N(t) ≡ max{n ≥ 0 : Tn ≤ t}, t ≥ 0 ,

with T0 ≡ 0, then N is a unit-jump counting process if Tn+1 > Tn for all n ≥ 0, while N is non-explosive if Tn → ∞ w.p.1 as n → ∞; see p. 18 of Bremaud (1981).

As discussed by Bremaud, the compensator of a non-explosive unit-jump counting process is typically a stochastic process with sample paths that are absolutely continuous with respect to Lebesgue measure, so that the compensator A can be represented as an integral

A(t) = ∫_0^t X(s) ds, t ≥ 0 ,

where X ≡ X(t) : t ≥ 0 is adapted to the filtration F. When the compensator has such an integral representation, the integrand X is called the stochastic intensity of the counting process N.

We will apply the following extension of Theorem 3.1.

Lemma 3.1. (PQV for unit-jump counting processes) If N is a non-explosive unit-jump counting process adapted to F with E[N(t)] < ∞ for all t, and if the compensator A of N provided by Theorem 3.1 is continuous, then the martingale M ≡ N − A is a square-integrable martingale with respect to F with quadratic variation processes:

〈M〉 = A and [M] = N .   (3.8)

We first observe that the conditions of Lemma 3.1 imply that N is a nonnegative F-submartingale, so that we can apply Theorem 3.1. We use the extra conditions to get more. We prove Lemma 3.1 in the Appendix.

3.4. The First Martingale Representation

We start with the first sample-path construction in §2.1. As a regularity condition, here we assume that

E[Q(0)] < ∞ . (3.9)

We will show how to remove that condition later in §6.3. It could also be removed immediately if we chose to localize.

Here is a quick summary of how martingales enter the picture: The Poisson processes A(t) and S(t) underlying the first representation of the queueing model in §2.1, as well as the new processes A(λt) and S(µ ∫_0^t Q(s) ds) there, have nondecreasing nonnegative sample paths. Consequently, they are submartingales with respect to appropriate filtrations (histories, families of σ-fields). Thus, by subtracting the compensators, we obtain martingales. Then the martingale M so constructed turns out to be square integrable, so that M² − 〈M〉 is again a martingale, where 〈M〉 is the predictable quadratic variation, which in our context will coincide with the compensator.


In constructing this representation, we want to be careful about the filtration (histories, family of σ-fields). Here we will want to use the filtration F ≡ Ft : t ≥ 0 defined by

Ft ≡ σ(Q(0), A(λs), S(µ ∫_0^s Q(u) du) : 0 ≤ s ≤ t), t ≥ 0 ,   (3.10)

augmented by including all null sets. The following processes will be proved to be F-martingales:

M1(t) ≡ A(λt) − λt,
M2(t) ≡ S(µ ∫_0^t Q(s) ds) − µ ∫_0^t Q(s) ds, t ≥ 0 ,   (3.11)

where here A refers to the arrival process. Hence, instead of (2.2), we have the alternate martingale representation

Q(t) = Q(0) + M1(t) − M2(t) + λt − µ ∫_0^t Q(s) ds , t ≥ 0 .   (3.12)
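To see the representation (3.11)-(3.12) at work numerically, one can simulate the queue from the birth-and-death rates (2.3), track the departure count D(t) together with µ ∫_0^t Q(s) ds, and check that M2(t) = D(t) − µ ∫_0^t Q(s) ds has mean near zero with second moment near the mean of its predictable quadratic variation (Lemma 3.2). The sketch below is illustrative only; all parameter values are arbitrary assumptions.

```python
# Monte Carlo sketch of (3.11)-(3.12): M2(T) = D(T) - mu * int_0^T Q(s) ds should have
# mean ~ 0 and E[M2(T)^2] ~ E[mu * int_0^T Q(s) ds], its predictable quadratic variation.
import numpy as np

def one_replication(q0, lam, mu, T, rng):
    """Simulate Q from the rates (2.3); return (M2(T), mu * int_0^T Q(s) ds)."""
    t, q, departures, integral = 0.0, q0, 0, 0.0
    while True:
        rate = lam + mu * q
        dt = rng.exponential(1.0 / rate)
        if t + dt > T:
            integral += mu * q * (T - t)
            return departures - integral, integral
        integral += mu * q * dt
        t += dt
        if rng.random() < lam / rate:
            q += 1                      # arrival
        else:
            q -= 1                      # departure, i.e. a jump of D
            departures += 1

rng = np.random.default_rng(6)
samples = np.array([one_replication(q0=50, lam=50.0, mu=1.0, T=5.0, rng=rng)
                    for _ in range(2000)])
m2, pqv = samples[:, 0], samples[:, 1]
print(round(m2.mean(), 3), round((m2**2).mean(), 1), round(pqv.mean(), 1))
```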

In applications of Theorem 3.1 and Lemma 3.1, it remains to show that the conditions are satisfied and to identify the compensator. The following lemma fills in that step for a random time change of a rate-1 Poisson process by applying the optional sampling theorem. At this step, it is natural, as in §12 of Mandelbaum and Pats (1998), to apply the optional sampling theorem for martingales indexed by directed sets (Theorem 2.8.7 on p. 87 of Ethier and Kurtz (1986)) associated with multiparameter random time changes (§6.2 on p. 311 of Ethier and Kurtz), but here we can use a more elementary approach. For supporting theory at this point, see Theorem 17.24 and Proposition 7.9 of Kallenberg (2002). We use the multiparameter random time change in §7.1.

Let ∘ be the composition map applied to functions, i.e., (x ∘ y)(t) ≡ x(y(t)).

Lemma 3.2. (random time change of a rate-1 Poisson process) Suppose that S is a rate-1 Poisson process adapted to a filtration F ≡ Ft : t ≥ 0 and I ≡ I(t) : t ≥ 0 is a stochastic process with continuous nondecreasing nonnegative sample paths, where I(t) is an F-stopping time for each t ≥ 0. In addition, suppose that the following moment conditions hold:

E[I(t)] < ∞ and E[S(I(t))] < ∞ for all t ≥ 0 .   (3.13)

Then S ∘ I ≡ S(I(t)) : t ≥ 0 is a non-explosive unit-jump counting process such that M ≡ S ∘ I − I ≡ S(I(t)) − I(t) : t ≥ 0 is a square-integrable martingale with respect to the filtration FI ≡ FI(t) : t ≥ 0, having quadratic variation processes

〈M〉(t) = I(t) and [M](t) = S(I(t)), t ≥ 0 .   (3.14)

Proof. Since the sample paths of I are continuous, it is evident that S ∘ I is a unit-jump counting process. In order to apply the optional sampling theorem, we now localize by letting

Im(t) ≡ I(t) ∧ m, t ≥ 0 .   (3.15)

Since I(t) is an F-stopping time, Im(t) is a bounded F-stopping time for each m ≥ 1. The optional sampling theorem then implies that Mm ≡ S ∘ Im − Im ≡ S(Im(t)) − Im(t) : t ≥ 0 is an FI-martingale; e.g., see p. 7 of Bremaud (1981) or p. 61 of Ethier and Kurtz (1986). As a consequence, Im is the compensator of S ∘ Im. Since we have the moment conditions in (3.13), we can let m ↑ ∞ and apply the monotone convergence theorem with conditioning, as on p. 280 of Bremaud, to deduce that M ≡ S ∘ I − I ≡ S(I(t)) − I(t) : t ≥ 0 is a martingale. Specifically, given that

E[S(Im(t + s)) − Im(t + s) | FI(s)] = S(Im(s)) − Im(s) w.p.1

for all m,

E[S(Im(t + s)) | FI(s)] → E[S(I(t + s)) | FI(s)] w.p.1 as m → ∞ ,
E[Im(t + s) | FI(s)] → E[I(t + s) | FI(s)] w.p.1 as m → ∞ ,

S(Im(s)) ↑ S(I(s)) and Im(s) ↑ I(s) as m → ∞, we have

E[S(I(t + s)) − I(t + s) | FI(s)] = S(I(s)) − I(s) w.p.1 .

Lemma 3.1 implies the square-integrability and identifies the quadratic variation processes.

To apply Lemma 3.2 to our M/M/∞ queueing problem, we need to verify the finite-moment conditions in (3.13). For that purpose, we use a crude inequality:

Lemma 3.3. (crude inequality) Given the representation (2.2),

Q(t) ≤ Q(0) + A(λt), t ≥ 0, (3.16)

so that

∫_0^t Q(s) ds ≤ t(Q(0) + A(λt)), t ≥ 0. (3.17)

Now we want to show that our processes in (3.11) and (3.12) actually satisfy the conditions of Lemma 3.2 with respect to the filtration in (3.10). However, to do the proof below we will first alter the filtration.

Lemma 3.4. (identifying the compensator and verifying the conditions) If E[Q(0)] < ∞ in the setting of §2.1 with {Q(t) : t ≥ 0} defined in (2.2),

I(t) ≡ µ ∫_0^t Q(s) ds, t ≥ 0, (3.18)

and the filtration F in (3.10), then the conditions of Lemma 3.2 are satisfied, so that S ◦ I − I is a square-integrable F-martingale with F-compensator I in (3.18).

Proof. First, we can apply the crude inequality in (3.17) to establish the required moment conditions: Since E[Q(0)] < ∞,

E[ µ ∫_0^t Q(s) ds ] ≤ µt(E[Q(0)] + E[A(λt)]) = µtE[Q(0)] + µλt² < ∞, t ≥ 0, (3.19)

and

S( µ ∫_0^t Q(s) ds ) ≤ S( µt(Q(0) + A(λt)) ), t ≥ 0, (3.20)

so that

E[ S( µ ∫_0^t Q(s) ds ) ] ≤ E[ S( µt(Q(0) + A(λt)) ) ]
  = E[ E[ S( µt(Q(0) + A(λt)) ) | Q(0) + A(λt) ] ]
  = µt(E[Q(0)] + λt) < ∞, t ≥ 0. (3.21)


In order to focus on the service completions in the easiest way, we initially condition on the entire arrival process, and consider the filtration F¹ ≡ {F¹_t : t ≥ 0} defined by

F¹_t ≡ σ( Q(0), A(u) : u ≥ 0, S(s) : 0 ≤ s ≤ t ), t ≥ 0, (3.22)

augmented by including all null sets. Then, by virtue of (2.2) and the recursive construction in Lemma 2.1, for each t ≥ 0, I(t) in (3.18) is a stopping time relative to {F¹_x : x ≥ 0}, i.e.,

{I(t) ≤ x} ∈ F¹_x for all x ≥ 0 and t ≥ 0.

(This step is a bit tricky: To know I(t), we need to know Q(s), 0 ≤ s ≤ t, but, by (2.2), that depends on I(s), 0 ≤ s ≤ t. Hence, to know whether or not I(t) ≤ x holds, it suffices to know {S(u) : 0 ≤ u ≤ x}.) Since {S(t) − t : t ≥ 0} is a martingale with respect to F¹ and the moment conditions in (3.19) and (3.21) are satisfied, we can apply Lemma 3.2 to deduce that {S(I(t)) − I(t) : t ≥ 0} is a square-integrable martingale with respect to the filtration F² ≡ {F²_t : t ≥ 0} defined by

F²_t ≡ F¹_{I(t)} ≡ σ( Q(0), A(u) : u ≥ 0, S(s) : 0 ≤ s ≤ I(t) ), t ≥ 0, (3.23)

augmented by including all null sets, and that I in (3.18) is both the compensator and the predictable quadratic variation. Finally, since the process representing the arrivals after time t, i.e., the stochastic process {A(t + s) − A(t) : s ≥ 0}, is independent of Q(s), 0 ≤ s ≤ t, by virtue of the recursive construction in Lemma 2.1, we can replace the filtration in (3.23) by the smaller filtration in (3.10). That completes the proof.

We now introduce corresponding processes associated with the sequence of models indexed by n. We have

Q_n(t) = Q_n(0) + M*_{n,1}(t) − M*_{n,2}(t) + λ_n t − µ ∫_0^t Q_n(s) ds, t ≥ 0, (3.24)

where

M*_{n,1}(t) ≡ A(λ_n t) − λ_n t,
M*_{n,2}(t) ≡ S( µ ∫_0^t Q_n(s) ds ) − µ ∫_0^t Q_n(s) ds. (3.25)

The filtrations change with n in the obvious way. We now introduce the scaling, just as in (1.3). Let the scaled martingales be

M_{n,1}(t) ≡ M*_{n,1}(t)/√n and M_{n,2}(t) ≡ M*_{n,2}(t)/√n, t ≥ 0. (3.26)

Then, from (3.24)–(3.26), we get

X_n(t) ≡ (Q_n(t) − n)/√n
  = (Q_n(0) − n)/√n + M*_{n,1}(t)/√n − M*_{n,2}(t)/√n + (λ_n t − nµt)/√n − µ ∫_0^t ((Q_n(s) − n)/√n) ds
  = (Q_n(0) − n)/√n + M*_{n,1}(t)/√n − M*_{n,2}(t)/√n − µ ∫_0^t ((Q_n(s) − n)/√n) ds
  = X_n(0) + M_{n,1}(t) − M_{n,2}(t) − µ ∫_0^t X_n(s) ds, t ≥ 0. (3.27)

Now we summarize this martingale representation for the scaled processes, depending upon the index n. Here is the implication of the analysis above:


Theorem 3.4. (first martingale representation for the scaled processes) If E[Q_n(0)] < ∞ for each n ≥ 1, then the scaled processes X_n in (1.3) have the martingale representation

X_n(t) ≡ X_n(0) + M_{n,1}(t) − M_{n,2}(t) − µ ∫_0^t X_n(s) ds, t ≥ 0, (3.28)

where the M_{n,i} are given in (3.25) and (3.26). These processes M_{n,i} are square-integrable martingales with respect to the filtrations F_n ≡ {F_{n,t} : t ≥ 0} defined by

F_{n,t} ≡ σ( Q_n(0), A(λ_n s), S( µ ∫_0^s Q_n(u) du ) : 0 ≤ s ≤ t ), t ≥ 0, (3.29)

augmented by including all null sets. Their associated predictable quadratic variations are 〈M_{n,1}〉(t) = λ_n t/n, t ≥ 0, and

〈M_{n,2}〉(t) = (µ/n) ∫_0^t Q_n(s) ds, t ≥ 0, (3.30)

where E[〈M_{n,2}〉(t)] < ∞ for all t ≥ 0 and n ≥ 1. The associated optional quadratic variations are [M_{n,1}](t) = A(λ_n t)/n, t ≥ 0, and

[M_{n,2}](t) = S( µ ∫_0^t Q_n(s) ds )/n, t ≥ 0. (3.31)

Note that X_n appears on both sides of the integral representation (3.28), but X_n(t) appears on the left, while X_n(s) for 0 ≤ s ≤ t appears on the right. In §4.1 we show how to work with this integral representation.
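Continuing the simulation sketch given after (3.12) (again purely illustrative, reusing the hypothetical mminf_path and numpy from that sketch), the scaled process in (3.28) and the two quadratic variations in (3.30)–(3.31) can be read off a simulated path with λ_n = nµ and Q_n(0) = n:

n, mu = 400, 1.0
times, queue, comp, na, nd = mminf_path(q0=n, lam=n * mu, mu=mu, T=10.0)
t_end = times[-1]
X_n = (queue - n) / np.sqrt(n)                # scaled process X_n of (1.3)/(3.28) at event times
M_n1 = (na - n * mu * t_end) / np.sqrt(n)     # M_{n,1}(t_end), cf. (3.25)-(3.26)
M_n2 = (nd - comp[-1]) / np.sqrt(n)           # M_{n,2}(t_end)
angle_Mn2 = comp[-1] / n                      # <M_{n,2}>(t_end) = (mu/n) int_0^t Q_n, cf. (3.30)
bracket_Mn2 = nd / n                          # [M_{n,2}](t_end) = S(mu int Q_n)/n, cf. (3.31)
print(X_n[-1], angle_Mn2, bracket_Mn2)        # the two quadratic variations are close for large n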

3.5. The Second Martingale Representation

We can also start with the second sample-path construction and obtain another integral representation of the form (3.28).

Now we start with the martingales:

M_A(t) ≡ A_λ(t) − λt,
M_{S_{µ,k}}(t) ≡ S_{µ,k}(t) − µt,
M_S(t) ≡ Σ_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dS_{µ,k}(s) − Σ_{k=1}^∞ ∫_0^t µ 1{Q(s−) ≥ k} ds
  = Σ_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dS_{µ,k}(s) − ∫_0^t Σ_{k=1}^∞ µ 1{Q(s−) ≥ k} ds
  = Σ_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dS_{µ,k}(s) − µ ∫_0^t Q(s−) ds, (3.32)

so that, instead of (2.2), we have the alternate representation

Q(t) = Q(0) + M_A(t) − M_S(t) + λt − µ ∫_0^t Q(s−) ds, t ≥ 0, (3.33)

where M_A and M_S are square-integrable martingales with respect to the filtration F ≡ {F_t : t ≥ 0} defined by

F_t ≡ σ( Q(0), A_λ(s), S_{µ,k}(s), k ≥ 1 : 0 ≤ s ≤ t ), t ≥ 0, (3.34)


again augmented by the null sets. Notice that this martingale representation is very similar to the martingale representation in (3.12). The martingales in (3.12) and (3.33) are different and the filtrations in (3.10) and (3.34) are different, but the predictable quadratic variations are the same and the form of the integral representation is the same. Thus, there is an analog of Theorem 3.4 in this setting.

We now provide theoretical support for the claims above. First, we put ourselves in the setting of Lemma 3.1.

Lemma 3.5. (a second integrable counting process with unit jumps) If E[Q(0)] < ∞, then, in addition to being adapted to the filtration F_t in (3.34), the stochastic process Y defined by

Y(t) ≡ Σ_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} dS_{µ,k}(s), t ≥ 0, (3.35)

is a unit-jump counting process such that E[Y(t)] < ∞ for all t ≥ 0.

Proof. It is immediate that Y is a counting process, but there is some question about integrability and unit jumps, because there are infinitely many Poisson processes under consideration. To establish integrability, we apply the crude inequality (3.16) to get

Y(t) ≤ Σ_{k=1}^∞ ∫_0^t 1{Q(0) + A(λt) ≥ k} dS_{µ,k}(s) ≤ Σ_{k=1}^∞ 1{Q(0) + A(λt) ≥ k} S_{µ,k}(t), (3.36)

so that

E[Y(t)] ≤ Σ_{k=1}^∞ P(Q(0) + A(λt) ≥ k) E[S_{µ,k}(t)] = Σ_{k=1}^∞ P(Q(0) + A(λt) ≥ k) µt
  ≤ µt E[Q(0) + A(λt)] = µt(E[Q(0)] + λt) < ∞. (3.37)

The crude inequality (3.16) implies, for any t > 0, that the integrand 1{Q(s−) ≥ k} is non-zero for any s, 0 ≤ s ≤ t, for only finitely many k. As a consequence, Y necessarily has unit jumps. For any finite number of independent Poisson processes, a common jump anywhere occurs with probability 0.

Given Lemmas 3.1 and 3.5, it only remains to identify the compensator of the counting process Y, which we call Y since A is used to refer to the arrival process. For that purpose, we can apply the integration theorem, as on p. 10 of Bremaud (1981). But our process Y in (3.35) is actually a sum involving infinitely many Poisson processes, so we need to be careful.

Lemma 3.6. (identifying the compensator of Y) The compensator of Y in (3.35) is given by

Ỹ(t) ≡ Σ_{k=1}^∞ ∫_0^t 1{Q(s−) ≥ k} µ ds = µ ∫_0^t Q(s−) ds, t ≥ 0; (3.38)

i.e., Y − Ỹ is an F-martingale for the filtration in (3.34).

Proof. As indicated above, we can apply the integration theorem on p. 10 of Bremaud, but we have to be careful because Y involves infinitely many Poisson processes. Hence we first consider the first n terms in the sum. With that restriction, since the integrand is an indicator function for each k, we consider the integral of a bounded predictable process with respect to


the martingale {Σ_{k=1}^n (S_{µ,k}(t) − µt) : t ≥ 0}, which is a martingale of "integrable bounded variation," as required (and defined) by Bremaud. As a consequence,

M_S^n(t) ≡ Σ_{k=1}^n ∫_0^t 1{Q(s−) ≥ k} dS_{µ,k}(s) − Σ_{k=1}^n ∫_0^t µ 1{Q(s−) ≥ k} ds, t ≥ 0, (3.39)

is an F-martingale. However, given that E[Y(t)] < ∞, we can apply the monotone convergence theorem to each of the two terms in (3.39) in order to take the limit as n → ∞ to deduce that E[Ỹ(t)] < ∞ and M_S itself, as defined in (3.32), is an F-martingale, which implies that the compensator of Y in (3.35) is indeed given by (3.38).

By this route we obtain another integral representation for the scaled processes of exactly the same form as in Theorem 3.4. As before in (3.24)–(3.27), we introduce the sequence of models indexed by n. The martingales and filtrations are slightly different, but in the end the predictable quadratic variation processes are essentially the same.

Theorem 3.5. (second martingale representation for the scaled processes) If E[Q_n(0)] < ∞ for each n ≥ 1, then the scaled processes X_n in (1.3) have the martingale representation

X_n(t) ≡ X_n(0) + M_{n,1}(t) − M_{n,2}(t) − µ ∫_0^t X_n(s) ds, t ≥ 0, (3.40)

where the M_{n,i} are given in (3.26), but instead of (3.25), we have

M*_{n,1}(t) ≡ A_{λ_n}(t) − λ_n t,
M*_{n,2}(t) ≡ Σ_{k=1}^∞ ∫_0^t 1{Q_n(s−) ≥ k} dS_{µ,k}(s) − µ ∫_0^t Q_n(s−) ds. (3.41)

These processes M_{n,i} are square-integrable martingales with respect to the filtrations F_n ≡ {F_{n,t} : t ≥ 0} defined by

F_{n,t} ≡ σ( Q_n(0), A_{λ_n}(s), S_{µ,k}(s), k ≥ 1 : 0 ≤ s ≤ t ), t ≥ 0, (3.42)

augmented by including all null sets. Their associated predictable quadratic variations are 〈M_{n,1}〉(t) = λ_n t/n, t ≥ 0, and

〈M_{n,2}〉(t) = (µ/n) ∫_0^t Q_n(s) ds, t ≥ 0, (3.43)

where E[〈M_{n,2}〉(t)] < ∞ for all t ≥ 0 and n ≥ 1 and 〈M_{n,1}〉(t) = λ_n t/n → µt as n → ∞. The associated optional quadratic variations are [M_{n,1}](t) = A_{λ_n}(t)/n, t ≥ 0, and

[M_{n,2}](t) = ( Σ_{k=1}^∞ ∫_0^t 1{Q_n(s−) ≥ k} dS_{µ,k}(s) )/n, t ≥ 0. (3.44)

3.6. The Third Martingale Representation

We can also obtain a martingale representation for the stochastic process Q by exploiting the fact that the stochastic process {Q(t) : t ≥ 0} is a birth-and-death process. We have the basic representation

Q(t) = Q(0) + A(t) − D(t), t ≥ 0, (3.45)

where A is the arrival process and D is the departure process, as in (2.2). Since Q is a birth-and-death process, we can apply the Levy and Dynkin formulas, as on p. 294 of Bremaud, to


obtain martingales associated with various counting processes associated with Q, including the counting processes A and D. Of course, A is easy, but the Dynkin formula immediately yields the desired martingale for D:

M_D(t) ≡ D(t) − µ ∫_0^t Q(s) ds, t ≥ 0, (3.46)

where the compensator of D is just as in the first two martingale representations, i.e., as in (3.11), (3.18), (3.32) and (3.38); see pp. 6 and 294 of Bremaud. We thus again obtain the martingale representation of the form (3.12) and (3.33). Here, however, the filtration can be taken to be

F_t ≡ σ( Q(s) : 0 ≤ s ≤ t ), t ≥ 0. (3.47)

The proof of Theorem 1.1 is then the same as for the second representation, which will be by an application of the martingale FCLT in §8.

4. The Main Steps in the Proof of Theorem 1.1

In this section we indicate the main steps in the proof of Theorem 1.1. First, in §4.1 we show that the integral representation has a unique solution, so that it constitutes a continuous function from D to D. Next, in §4.2 we show how the limit can be obtained from the functional central limit theorem for the Poisson process and the continuous mapping theorem, but some technical details remain to be verified. In §4.3 we show how the proof can be completed without martingales by directly establishing the associated fluid limit.

4.1. The Integral Representation Is Continuous

We apply the continuous-mapping theorem (CMT) with the integral representation in (3.28) and (3.40) in order to establish the desired convergence; for background on the CMT, see §3.4 of Whitt (2002). In subsequent sections we will show that the scaled martingales converge weakly to independent Brownian motions, i.e.,

(M_{n,1}, M_{n,2}) ⇒ (√µ B_1, √µ B_2) in D² ≡ D × D as n → ∞, (4.1)

where B_1 and B_2 are two independent standard Brownian motions, from which an application of the CMT with subtraction yields

M_{n,1} − M_{n,2} ⇒ √µ B_1 − √µ B_2 =_d √(2µ) B in D as n → ∞, (4.2)

where B is a single standard Brownian motion.

We then apply the CMT with the function f : D × R → D taking (y, b) into x determined by the integral representation

x(t) = b + y(t) − µ ∫_0^t x(s) ds, t ≥ 0. (4.3)

In the pre-limit, the function y in (4.3) is played by M_{n,1} − M_{n,2} ≡ {M_{n,1}(t) − M_{n,2}(t) : t ≥ 0} in (4.2), while b is played by X_n(0). In the limit, the function y in (4.3) is played by the limit √(2µ) B in (4.2), while b is played by X(0).

For our application, the limiting stochastic process in (4.2) has continuous sample paths. Moreover, the function f in (4.3) maps continuous functions into continuous functions, as we show below. Hence, it suffices to show that the map f : D × R → D is measurable


and continuous at continuous limits. Since the limit is necessarily continuous as well, the required continuity follows from continuity when the function space D appearing in both the domain and the range is endowed with the topology of uniform convergence on bounded intervals. However, if we only establish such continuity, then that leaves open the issue of measurability. It is significant that the σ-field on D generated by the topology of uniform convergence on bounded intervals is not the desired customary σ-field on D, which is generated by the coordinate projections or by any of the Skorohod topologies; see §11.5 of Whitt (2002) and §18 of Billingsley (1968). We prove measurability with respect to the appropriate σ-field on D (generated by the J₁ topology) by proving continuity when the function space D appearing in both the domain and the range is endowed with the Skorohod J₁ topology. That implies the required measurability. At the same time, of course, it provides continuity in that setting.

We now establish the basic continuity result. We establish a slightly more general form than needed here in order to be able to treat other cases. In particular, we introduce a Lipschitz function h : R → R; i.e., we assume that there exists a constant c > 0 such that

|h(s1)− h(s2)| ≤ c|s1 − s2| for all s1, s2 ∈ R . (4.4)

We apply the more general form to treat the Erlang A model in §7.1.

Theorem 4.1. (continuity of the integral representation) Consider the integral representation

x(t) = b + y(t) + ∫_0^t h(x(s)) ds, t ≥ 0, (4.5)

where h : R → R satisfies h(0) = 0 and is a Lipschitz function as defined in (4.4). The integral representation in (4.5) has a unique solution x, so that the integral representation constitutes a function f : D × R → D mapping (y, b) into x ≡ f(y, b). In addition, the function f is continuous provided that the function space D (in both the domain and range) is endowed with either: (i) the topology of uniform convergence over bounded intervals or (ii) the Skorohod J₁ topology. Moreover, if y is continuous, then so is x.

Proof. If y is a piecewise-constant function, then we can directly construct the solution x of the integral representation by doing an inductive construction, just as in Lemma 2.1. Since any element y of D can be represented as the limit of piecewise-constant functions, where the convergence is uniform over bounded intervals, using endpoints that are continuity points of y, we can then extend the function f to arbitrary elements of D, exploiting continuity in the topology of uniform convergence over bounded intervals, shown below. Uniqueness follows from the fact that the only function x in D satisfying the inequality

|x(t)| ≤ c ∫_0^t |x(s)| ds, t ≥ 0, (4.6)

is the zero function, which is a consequence of Gronwall's inequality, which we re-state in Lemma 4.1 below in the form needed here.

For the remainder of the proof, we apply Gronwall's inequality again. We introduce the norm

‖x‖_T ≡ sup_{0≤t≤T} |x(t)|. (4.7)

First consider the case of the topology of uniform convergence over bounded intervals. We need to show that, for any ε > 0, there exists a δ > 0 such that ‖x_1 − x_2‖_T < ε when


|b_1 − b_2| + ‖y_1 − y_2‖_T < δ, where (y_i, x_i) are two pairs of functions satisfying the relation (4.5). From (4.5), we have

|x_1(t) − x_2(t)| ≤ |b_1 − b_2| + |y_1(t) − y_2(t)| + ∫_0^t |h(x_1(s)) − h(x_2(s))| ds
  ≤ |b_1 − b_2| + |y_1(t) − y_2(t)| + c ∫_0^t |x_1(s) − x_2(s)| ds, t ≥ 0. (4.8)

Suppose that |b_1 − b_2| + ‖y_1 − y_2‖_T ≤ δ. By Gronwall's inequality,

|x_1(t) − x_2(t)| ≤ δe^{ct} and ‖x_1 − x_2‖_T ≤ δe^{cT}. (4.9)

Hence it suffices to let δ = εe^{−cT}.

We now turn to the Skorohod J₁ topology; see §§3.3 and 11.5 and Chapter 12 of Whitt (2002) for background. To treat this non-uniform topology, we will use the fact that the function x is necessarily bounded. That is proved later in Lemma 5.6. We want to show that x_n → x in D([0,∞), R, J₁) when b_n → b in R and y_n → y in D([0,∞), R, J₁). For y given, let the interval right endpoint T be a continuity point of y. Then there exist increasing homeomorphisms λ_n of the interval [0, T] such that ‖y_n − y ◦ λ_n‖_T → 0 and ‖λ_n − e‖_T → 0 as n → ∞. Moreover, it suffices to consider homeomorphisms λ_n that are absolutely continuous with respect to Lebesgue measure on [0, T] having derivatives λ̇_n satisfying ‖λ̇_n − 1‖_T → 0 as n → ∞. The fact that the topology is actually unchanged is a consequence of Billingsley's equivalent complete metric d_0 on pp. 112–114 of Billingsley (1968). Hence, for y given, let M ≡ sup_{0≤t≤T} |x(t)|. Since h in (4.4) is Lipschitz, we have

sup_{0≤t≤T} |h(x(t))| ≤ h(0) + sup_{0≤t≤T} |h(x(t)) − h(0)| ≤ h(0) + cM = cM.

Thus we have

|x_n(t) − x(λ_n(t))| ≤ |b_n − b| + ‖y_n − y ◦ λ_n‖_T + | ∫_0^t h(x_n(u)) du − ∫_0^{λ_n(t)} h(x(u)) du |
  ≤ |b_n − b| + ‖y_n − y ◦ λ_n‖_T + | ∫_0^t h(x_n(u)) du − ∫_0^t h(x(λ_n(u))) λ̇_n(u) du |
  ≤ |b_n − b| + ‖y_n − y ◦ λ_n‖_T + ‖λ̇_n − 1‖_T ∫_0^T |h(x(u))| du + ∫_0^t |h(x_n(u)) − h(x(λ_n(u)))| du
  ≤ |b_n − b| + ‖y_n − y ◦ λ_n‖_T + ‖λ̇_n − 1‖_T (cMT) + c ∫_0^t |x_n(u) − x(λ_n(u))| du. (4.10)

Choose n_0 such that ‖λ̇_n − 1‖_T < δ/(2cMT) and |b_n − b| + ‖y_n − y ◦ λ_n‖_T < δ/2 for n ≥ n_0. Then Gronwall's inequality implies that

|x_n(t) − x(λ_n(t))| ≤ δe^{ct} for all t, 0 ≤ t ≤ T,

so that

‖x_n − x ◦ λ_n‖_T ≤ δe^{cT}.


Hence, for ε > 0 given, choose δ < εe^{−cT} to have ‖x_n − x ◦ λ_n‖_T ≤ ε for n ≥ n_0. If necessary, choose n larger to make ‖λ_n − e‖_T < ε and ‖λ̇_n − 1‖_T < ε as well. Finally, for the inheritance of continuity, note that

x(t + s) − x(t) = y(t + s) − y(t) + ∫_t^{t+s} h(x(u)) du,

so that

|x(t + s) − x(t)| ≤ |y(t + s) − y(t)| + ∫_t^{t+s} |h(x(u))| du.

Since x is bounded over [0, T], x is continuous if y is continuous.

In our case we can simply let h(s) = −µs, but we will need the more complicated function h in (4.5) and (4.4) in §7.1. To be self-contained, we now state a version of Gronwall's inequality; see p. 498 of Ethier and Kurtz (1986). See §11 of Mandelbaum, Massey and Reiman (1998) for other versions of Gronwall's inequality.

Lemma 4.1. (version of Gronwall's inequality) Suppose that g : [0,∞) → [0,∞) is a Borel-measurable function such that

0 ≤ g(t) ≤ ε + M ∫_0^t g(s) ds, 0 ≤ t ≤ T, (4.11)

for some positive finite ε and M. Then

g(t) ≤ εe^{Mt}, 0 ≤ t ≤ T. (4.12)
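Although not part of the original argument, the constructive solution in the proof of Theorem 4.1 and the Gronwall bound (4.9) are easy to illustrate numerically. The sketch below (Python; the step size, the perturbation, and the choice h(s) = −µs are assumptions of the illustration only) builds x = f(y, b) on a grid with a forward scheme and checks that the output gap is controlled by the input gap times e^{cT}.

import numpy as np

def solve_integral_rep(y, b, h, dt):
    # Forward construction of x(t) = b + y(t) + int_0^t h(x(s)) ds on a grid of spacing dt,
    # mimicking the piecewise-constant construction in the proof of Theorem 4.1.
    x = np.empty_like(y)
    x[0] = b + y[0]
    integral = 0.0
    for i in range(1, len(y)):
        integral += h(x[i - 1]) * dt
        x[i] = b + y[i] + integral
    return x

mu, T, dt = 1.0, 10.0, 1e-3
grid = np.arange(0.0, T + dt, dt)
rng = np.random.default_rng(2)
y1 = np.cumsum(rng.normal(scale=np.sqrt(dt), size=grid.size))   # a rough Brownian-like input path
y2 = y1 + 0.01 * np.sin(grid)                                   # a small perturbation of it
x1 = solve_integral_rep(y1, 0.0, lambda s: -mu * s, dt)
x2 = solve_integral_rep(y2, 0.1, lambda s: -mu * s, dt)
gap_in = 0.1 + np.max(np.abs(y1 - y2))         # |b1 - b2| + ||y1 - y2||_T
gap_out = np.max(np.abs(x1 - x2))              # ||x1 - x2||_T
print(gap_out, gap_in * np.exp(mu * T))        # Gronwall bound (4.9): gap_out <= gap_in * e^{cT}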

It thus remains to establish the limit in (4.1). Our proof based on the first martingale representation in Theorem 3.4 relies on a FCLT for the Poisson process and the CMT with the composition map. The application of the CMT with the composition map requires a fluid limit, which requires further argument. That is contained in subsequent sections.

4.2. A Poisson FCLT Plus the CMT

As a consequence of the last section, it suffices to show that the scaled martingales converge, as in (4.1). For that result, it is natural to directly apply the martingale FCLT in §7.1 of Ethier and Kurtz (1986), as reviewed here in §8. When doing so, it is necessary to show that the scaled predictable quadratic variation processes converge. That required step will be essentially equivalent to what we do here.

However, starting with the first martingale representation in Theorem 3.4, we do not need to apply the martingale FCLT. Instead, we can justify the martingale limit in (4.1) by yet another application of the CMT, using the composition map associated with the random time changes, in addition to a functional central limit theorem (FCLT) for scaled Poisson processes. Our approach also requires establishing a limit for the sequence of scaled predictable quadratic variations associated with the martingales, so the main steps of the argument become the same as when applying the martingale FCLT.

The FCLT for Poisson processes is a classical result. It is a special case of the FCLT for a renewal process, appearing as Theorem 17.3 in Billingsley (1968). It and generalizations are also discussed extensively in Whitt (2002); see §§6.3, 7.3, 7.4, 13.7 and 13.8. The FCLT for a Poisson process can also be obtained via a strong approximation, as was done by Kurtz (1978), Mandelbaum and Pats (1995, 1998) and Mandelbaum, Massey and Reiman (1998). Finally, the FCLT for a Poisson process itself can be obtained as an easy application of the martingale FCLT, as we show in §8.


We start with the scaled Poisson processes

M_{A,n}(t) ≡ (A(nt) − nt)/√n and M_{S,n}(t) ≡ (S(nt) − nt)/√n, t ≥ 0. (4.13)

We employ the following basic FCLT: Since A and S are independent rate-1 Poisson processes, we have

Theorem 4.2. (FCLT for independent Poisson processes) If A and S are independent rate-1 Poisson processes, then

(M_{A,n}, M_{S,n}) ⇒ (B_1, B_2) in D² ≡ D × D as n → ∞, (4.14)

where M_{A,n} and M_{S,n} are the scaled processes in (4.13), while B_1 and B_2 are independent standard Brownian motions.
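A quick numerical illustration of the scaling in (4.13) (purely illustrative, not from the paper): for large n the terminal value of the scaled, centered Poisson path has mean near 0 and variance near t, as the Brownian limit in Theorem 4.2 predicts.

import numpy as np

rng = np.random.default_rng(3)

def scaled_poisson_path(n, T, dt):
    # Values of M_{A,n}(t) = (A(nt) - nt)/sqrt(n) on a grid, built from independent
    # Poisson increments of a rate-1 process A.
    increments = rng.poisson(lam=n * dt, size=int(round(T / dt)))
    A_nt = np.concatenate(([0], np.cumsum(increments)))
    t = np.arange(A_nt.size) * dt
    return t, (A_nt - n * t) / np.sqrt(n)

samples = [scaled_poisson_path(n=10_000, T=1.0, dt=1e-3)[1][-1] for _ in range(200)]
print(np.mean(samples), np.var(samples))   # roughly 0 and 1 = Var(B(1)), as in (4.14)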

We can prove the desired limit in (4.1) for both martingale representations, but we will only give the details for the first martingale representation in Theorem 3.4. In order to get the desired limit in (4.1), we introduce a deterministic and a random time change. For that purpose, let e : [0,∞) → [0,∞) be the identity function in D, defined by e(t) ≡ t for t ≥ 0. Then let

Φ_{A,n}(t) ≡ λ_n t/n = µt ≡ (µe)(t),
Φ_{S,n}(t) ≡ (µ/n) ∫_0^t Q_n(s) ds, t ≥ 0. (4.15)

We will establish the following fluid limit, which can be regarded as a functional weak law of large numbers (FWLLN). Here below, and frequently later, we have convergence in distribution to a deterministic limit; that is equivalent to convergence in probability; see p. 27 of Billingsley (1999).

Lemma 4.2. (desired fluid limit) Under the conditions of Theorem 1.1,

Φ_{S,n} ⇒ µe in D as n → ∞, (4.16)

where Φ_{S,n} is defined in (4.15).

For that purpose, it suffices to establish another more basic fluid limit. Consider the stochastic process

Ψ_{S,n}(t) ≡ Q_n(t)/n, t ≥ 0. (4.17)

Let ω be the function that is identically 1 for all t.

Lemma 4.3. (basic fluid limit) Under the conditions of Theorem 1.1,

Ψ_{S,n} ⇒ ω in D as n → ∞, (4.18)

where Ψ_{S,n} is defined in (4.17) and ω(t) = 1, t ≥ 0.


Proof of Lemma 4.2. The desired fluid limit in Lemma 4.2 follows from the basic fluid limit in Lemma 4.3 by applying the CMT with the function h : D → D defined by

h(x)(t) ≡ µ ∫_0^t x(s) ds, t ≥ 0. (4.19)

Before proceeding, we remark that Lemma 4.2 can also be viewed as more elementary than Lemma 4.3, since the sample paths of Φ_{S,n} are nondecreasing. By Lemma 8.1, it then suffices to show that Φ_{S,n}(t) ⇒ µt in R as n → ∞ for each t.

We thus have the following result:

Lemma 4.4. (all but the fluid limit) If the limit in (4.18) holds, then

(M_{n,1}, M_{n,2}) ⇒ (√µ B_1, √µ B_2) in D², (4.20)

as required to complete the proof of Theorem 1.1.

Proof. From the limit in (4.14), the desired fluid limit in (4.16) and Theorem 11.4.5 of Whitt (2002), it follows that

(M_{A,n}, µe, M_{S,n}, Φ_{S,n}) ⇒ (B_1, µe, B_2, µe) in D⁴ ≡ D × · · · × D as n → ∞. (4.21)

From the CMT with the composition map, as in §§3.4 and 13.2 of Whitt (2002) (in particular, Theorem 13.2.1), we obtain the desired limit in (4.1):

(M_{n,1}, M_{n,2}) ≡ (M_{A,n} ◦ µe, M_{S,n} ◦ Φ_{S,n}) ⇒ (B_1 ◦ µe, B_2 ◦ µe) in D² ≡ D × D as n → ∞. (4.22)

By basic properties of Brownian motion,

(B_1 ◦ µe, B_2 ◦ µe) =_d (√µ B_1, √µ B_2) in D². (4.23)

It thus remains to establish the key fluid limit in Lemma 4.3. In the next section we show how to do that directly, without martingales, by applying the continuous mapping provided by Theorem 4.1 in the fluid scale or, equivalently, by applying Gronwall's inequality again. We would stop there if we only wanted to analyze the M/M/∞ model, but in order to illustrate other methods used in Krichagina and Puhalskii (1997) and Puhalskii and Reiman (2000), we also apply martingale methods. Thus, in the subsequent four sections we show how to establish that fluid limit using martingales. Here is an outline of the remaining martingale argument:

(1) To prove the needed Lemma 4.3, it suffices to demonstrate that {X_n} is stochastically bounded in D. (Combine Lemma 5.10 and §6.1.)

(2) However, {X_n} is stochastically bounded in D if the sequences of martingales {M_{n,1}} and {M_{n,2}} are stochastically bounded in D. (Combine Theorem 3.4 and Lemma 5.6.)

(3) But then the sequences of martingales {M_{n,1}} and {M_{n,2}} are stochastically bounded in D if the associated sequences of predictable quadratic variations {〈M_{n,1}〉(t)} and {〈M_{n,2}〉(t)} are stochastically bounded in R for each t > 0. (Apply Lemma 5.9. One of these is trivial because it is deterministic.)

(4) Finally, we establish stochastic boundedness of {〈M_{n,2}〉(t)} (the one nontrivial case) through a crude bound in §6.2.

This alternate route to the fluid limit is much longer, but all the steps might be considered well known. We remark that the fluid limit seems to be required by any of the proofs, including by the direct application of the martingale FCLT.


4.3. The Direct Fluid Limit

In this section we prove Lemma 4.3 without using martingales. We do so by establishing a stochastic-process limit in the fluid scale which is similar to the corresponding stochastic-process limit with the more refined scaling. This is a standard line of reasoning for heavy-traffic stochastic-process limits; e.g., see the proofs of Theorems 9.3.4, 10.2.3 and 14.7.4 of Whitt (2002). The specific argument here follows §6 of Mandelbaum and Pats (1995).

By essentially the same reasoning as in §3.4, we obtain a fluid-scale analog of (3.27) and (3.28):

X̄_n(t) ≡ (Q_n(t) − n)/n
  = (Q_n(0) − n)/n + M*_{n,1}(t)/n − M*_{n,2}(t)/n + (λ_n t − nµt)/n − µ ∫_0^t ((Q_n(s) − n)/n) ds
  = X̄_n(0) + M̄_{n,1}(t) − M̄_{n,2}(t) − µ ∫_0^t X̄_n(s) ds, t ≥ 0, (4.24)

where

M̄_{n,1}(t) ≡ M*_{n,1}(t)/n and M̄_{n,2}(t) ≡ M*_{n,2}(t)/n, t ≥ 0, (4.25)

with M*_{n,i}(t) defined in (3.25).

Notice that the limit

X̄_n ⇒ η in D as n → ∞, (4.26)

where

η(t) ≡ 0, t ≥ 0, (4.27)

is equivalent to the desired conclusion of Lemma 4.3. Hence we will prove the fluid limit in (4.26).

The assumed limit in (1.4) implies that X̄_n(0) ⇒ 0 in R as n → ∞. We can apply Theorem 4.1 or directly Gronwall's inequality in Lemma 4.1 to deduce the desired limit (4.26) if we can establish the following lemma.

Lemma 4.5. (fluid limit for the martingales) Under the conditions of Theorem 1.1,

M̄_{n,i} ⇒ η in D w.p.1 as n → ∞, (4.28)

for i = 1, 2, where M̄_{n,i} is defined in (4.25) and η is defined in (4.27).

Proof of Lemma 4.5. We can apply the SLLN for the Poisson process, which is equivalent to the more general functional strong law of large numbers (FSLLN); see §3.2 of Whitt (2002a). (Alternatively, we could apply the FWLLN, which is a corollary to the FCLT.) First, the SLLN for the Poisson process states that

A(t)/t → 1 and S(t)/t → 1 w.p.1 as t → ∞, (4.29)

which implies the corresponding FSLLN's

sup_{0≤t≤T} |A(nt)/n − t| → 0 and sup_{0≤t≤T} |S(nt)/n − t| → 0 w.p.1 as n → ∞ (4.30)

for each T with 0 < T < ∞. We thus can treat M̄_{n,1} directly. To treat M̄_{n,2}, we combine (4.30) with the crude inequality in (3.17) and the representation in (4.24)–(4.25) in order to


obtain the desired limit (4.28). To elaborate, the crude inequality in (3.17) implies that, for any T_1 > 0, there exists T_2 such that

P( (µ/n) ∫_0^{T_1} Q_n(s) ds > T_2 ) → 0 as n → ∞. (4.31)

That provides the key, because

P( ‖M̄_{n,2}‖_{T_1} > ε ) ≤ P( (µ/n) ∫_0^{T_1} Q_n(s) ds > T_2 ) + P( ‖S̄_n‖_{T_2} > ε/2 ), (4.32)

where

S̄_n(t) ≡ (S(nt) − nt)/n, t ≥ 0.

5. Tightness and Stochastic Boundedness

5.1. Tightness

As indicated at the end of §4.2, we can also use a stochastic-boundedness argument in order to establish the desired fluid limit. Since stochastic boundedness is closely related to tightness, we start by reviewing tightness concepts. In the next section we apply the tightness notions to stochastic boundedness. The next three sections contain extra material not really needed for the current proofs.

We work in the setting of a complete separable metric space (CSMS), also known as a Polish space; see §§13 and 19 of Billingsley (1999), §§3.8–3.10 of Ethier and Kurtz (1986) and §§11.1 and 11.2 of Whitt (2002). (The space D^k ≡ D([0,∞), R)^k is made a CSMS in a standard way and the space of probability measures on D^k becomes a CSMS as well.) Key concepts are: closed, compact, tight, relatively compact and sequentially compact. We assume knowledge of metric spaces and compactness in metric spaces.

Definition 5.1. (tightness) A set A of probability measures on a metric space S is tight if, for all ε > 0, there exists a compact subset K of S such that

P(K) > 1 − ε for all P ∈ A.

A set of random elements of the metric space S is tight if the associated set of their probability laws on S is tight. Consequently, a sequence {X_n : n ≥ 1} of random elements of the metric space S is tight if, for all ε > 0, there exists a compact subset K of S such that

P(X_n ∈ K) > 1 − ε for all n ≥ 1.

Since a continuous image of a compact subset is compact, we have the following lemma.

Lemma 5.1. (continuous functions of random elements) Suppose that {X_n : n ≥ 1} is a tight sequence of random elements of the metric space S. If f : S → S′ is a continuous function mapping the metric space S into another metric space S′, then {f(X_n) : n ≥ 1} is a tight sequence of random elements of the metric space S′.


Proof. As before, let ◦ be used for composition: (f ◦ g)(x) ≡ f(g(x)). For any function f : S → S′ and any subset A of S, A ⊆ f⁻¹(f(A)). Let ε > 0 be given. Since {X_n : n ≥ 1} is a tight sequence of random elements of the metric space S, there exists a compact subset K of S such that

P(X_n ∈ K) > 1 − ε for all n ≥ 1.

Then f(K) will serve as the desired compact set in S′, because

P(f(X_n) ∈ f(K)) = P(X_n ∈ (f⁻¹ ◦ f)(K)) ≥ P(X_n ∈ K) > 1 − ε for all n ≥ 1.

We next observe that on products of separable metric spaces tightness is characterized by tightness of the components; see §11.4 of Whitt (2002).

Lemma 5.2. (tightness on product spaces) Suppose that {(X_{n,1}, . . . , X_{n,k}) : n ≥ 1} is a sequence of random elements of the product space S_1 × · · · × S_k, where each coordinate space S_i is a separable metric space. The sequence {(X_{n,1}, . . . , X_{n,k}) : n ≥ 1} is tight if and only if the sequence {X_{n,i} : n ≥ 1} is tight for each i, 1 ≤ i ≤ k.

Proof. The implication from the random vector to the components follows from Lemma 5.1 because the component X_{n,i} is the image of the projection map π_i : S_1 × · · · × S_k → S_i taking (x_1, . . . , x_k) into x_i, and the projection map is continuous. Going the other way, we use the fact that

A_1 × · · · × A_k = ∩_{i=1}^k π_i⁻¹(A_i) = ∩_{i=1}^k (π_i⁻¹ ◦ π_i)(A_1 × · · · × A_k)

for all subsets A_i ⊆ S_i. Thus, for each i and any ε > 0, we can choose K_i such that P(X_{n,i} ∉ K_i) < ε/k for all n ≥ 1. We then let K_1 × · · · × K_k be the desired compact set for the random vector. We have

P( (X_{n,1}, . . . , X_{n,k}) ∉ K_1 × · · · × K_k ) = P( ∪_{i=1}^k {X_{n,i} ∉ K_i} ) ≤ Σ_{i=1}^k P(X_{n,i} ∉ K_i) ≤ ε.

Tightness goes a long way toward establishing convergence because of Prohorov's theorem. It involves the notions of sequential compactness and relative compactness.

Definition 5.2. (relative compactness and sequential compactness) A subset A of a metric space S is relatively compact if every sequence {x_n : n ≥ 1} from A has a subsequence that converges to a limit in S (which necessarily belongs to the closure Ā of A). A subset of S is sequentially compact if it is closed and relatively compact.

We rely on the following basic result about compactness on metric spaces.

Lemma 5.3. (compactness coincides with sequential compactness on metric spaces) A subset A of a metric space S is compact if and only if it is sequentially compact.

We can now state Prohorov’s theorem; see §11.6 of Whitt (2002). It relates compactness ofsets of measures to compact subsets of the underlying sample space S on which the probabilitymeasures are defined.

Theorem 5.1. (Prohorov’s theorem) A subset of probability measures on a CSMS is tight ifand only if it is relatively compact.


We have the following elementary corollaries:

Corollary 5.1. (convergence implies tightness) If X_n ⇒ X as n → ∞ for random elements of a CSMS, then the sequence {X_n : n ≥ 1} is tight.

Corollary 5.2. (individual probability measures) Every individual probability measure on a CSMS is tight.

As a consequence of Prohorov’s Theorem, we have the following method for establishingconvergence of random elements:

Corollary 5.3. (convergence in distribution via tightness) Let {X_n : n ≥ 1} be a sequence of random elements of a CSMS S. We have

X_n ⇒ X in S as n → ∞

if and only if (i) the sequence {X_n : n ≥ 1} is tight and (ii) the limit of every convergent subsequence of {X_n : n ≥ 1} is the same fixed random element X (has a common probability law).

In other words, once we have established tightness, it only remains to show that the limits of all converging subsequences must be the same. With tightness, we only need to uniquely determine the limit. When proving Donsker's theorem, it is natural to uniquely determine the limit through the finite-dimensional distributions. Convergence of all the finite-dimensional distributions is not enough to imply convergence on D, but it does uniquely determine the distribution of the limit; see pp. 20 and 121 of Billingsley (1968) and Example 11.6.1 in Whitt (2002).

We will apply this approach to prove the martingale FCLT in §8. In the martingale setting it is natural instead to use the martingale characterization of Brownian motion, originally established by Levy (1948) and proved via Ito's formula by Kunita and Watanabe (1967); see p. 156 of Karatzas and Shreve (1988), and various extensions, such as to continuous processes with independent Gaussian increments, as in Theorem 1.1 on p. 338 of Ethier and Kurtz (1986). A thorough study of martingale characterizations appears in Chapter 4 of Liptser and Shiryayev (1986) and in Chapters VIII and IX of Jacod and Shiryayev (1987).

5.2. Stochastic Boundedness

We start by defining stochastic boundedness and relating it to tightness. We then discuss situations in which stochastic boundedness is preserved. Afterwards, we give conditions for a sequence of martingales to be stochastically bounded in D involving the stochastic boundedness of appropriate sequences of R-valued random variables. Finally, we show that the FWLLN follows from stochastic boundedness.

5.2.1. Connection to Tightness

For random elements of R and R^k, stochastic boundedness and tightness are equivalent, but tightness is stronger than stochastic boundedness for random elements of the function spaces C and D (and the associated product spaces C^k and D^k).

Definition 5.3. (stochastic boundedness for random vectors) A sequence {X_n : n ≥ 1} of random vectors taking values in R^k is stochastically bounded if the sequence is tight, as defined in Definition 5.1.


The notions of tightness and stochastic boundedness thus agree for random elements of R^k, but these notions differ for stochastic processes. For a function x ∈ D^k ≡ D([0,∞), R)^k, let

‖x‖_T ≡ sup_{0≤t≤T} |x(t)|,

where |b| is a norm of b ≡ (b_1, b_2, . . . , b_k) in R^k inducing the Euclidean topology, such as the maximum norm: |b| ≡ max{ |b_1|, |b_2|, . . . , |b_k| }.

Definition 5.4. (stochastic boundedness for random elements of D^k) A sequence {X_n : n ≥ 1} of random elements of D^k is stochastically bounded in D^k if the sequence of real-valued random variables {‖X_n‖_T : n ≥ 1} is stochastically bounded in R for each T > 0, using Definition 5.3.

For random elements of D^k, tightness is a strictly stronger concept than stochastic boundedness. Tightness of {X_n} in D^k implies stochastic boundedness, but not conversely; see §15 of Billingsley (1968). However, stochastic boundedness is sufficient for us, because it alone implies the desired fluid limit.

5.2.2. Preservation

We have the following analog of Lemma 5.2, which characterizes tightness for sequences of random vectors in terms of tightness of the associated sequences of components.

Lemma 5.4. (stochastic boundedness on D^k via components) A sequence {(X_{n,1}, . . . , X_{n,k}) : n ≥ 1} of random elements of the product space D^k ≡ D × · · · × D is stochastically bounded in D^k if and only if the sequence {X_{n,i} : n ≥ 1} is stochastically bounded in D ≡ D¹ for each i, 1 ≤ i ≤ k.

Proof. We can apply Lemma 5.2 after noticing that

‖(x_1, . . . , x_k)‖_T = max{ ‖x_i‖_T : 1 ≤ i ≤ k }

for each element (x_1, . . . , x_k) of D^k.

Lemma 5.5. (stochastic boundedness in D^k for sums) Suppose that

Y_n(t) ≡ X_{n,1}(t) + · · · + X_{n,k}(t), t ≥ 0,

for each n ≥ 1, where {(X_{n,1}, . . . , X_{n,k}) : n ≥ 1} is a sequence of random elements of the product space D^k ≡ D × · · · × D. If {X_{n,i} : n ≥ 1} is stochastically bounded in D for each i, 1 ≤ i ≤ k, then the sequence {Y_n : n ≥ 1} is stochastically bounded in D.

Note that the converse is not true: We could have k = 2 with X_{n,2}(t) = −X_{n,1}(t) for all n and t. In that case we have Y_n(t) = 0 for all n and t, whatever X_{n,1} may be.

We now provide conditions for the stochastic boundedness of integral representations such as (3.28).

Lemma 5.6. (stochastic boundedness for integral representations) Suppose that

X_n(t) ≡ X_n(0) + Y_{n,1}(t) + · · · + Y_{n,k}(t) + ∫_0^t h(X_n(s)) ds, t ≥ 0,

where h is a Lipschitz function as in (4.4) and (X_n(0), (Y_{n,1}, . . . , Y_{n,k})) is a random element of R × D^k for each n ≥ 1. If the sequences {X_n(0) : n ≥ 1} and {Y_{n,i} : n ≥ 1} are stochastically bounded (in R and D, respectively) for 1 ≤ i ≤ k, then the sequence {X_n : n ≥ 1} is stochastically bounded in D.


Proof. For the proof here, it is perhaps easiest to imitate the proof of Theorem 4.1. In particular, we will construct the bound

‖X_n‖_T ≤ Ke^{cT},

assuming that |X_n(0)| + ‖Y_{n,1}‖_T + · · · + ‖Y_{n,k}‖_T + T|h(0)| ≤ K. As before, we apply Gronwall's inequality, Lemma 4.1. Inserting absolute values into all the terms of the integral representation, we have

|X_n(t)| ≤ |X_n(0)| + |Y_{n,1}(t)| + · · · + |Y_{n,k}(t)| + ∫_0^t |h(X_n(s))| ds
  ≤ |X_n(0)| + ‖Y_{n,1}‖_T + · · · + ‖Y_{n,k}‖_T + T|h(0)| + c ∫_0^t |X_n(s)| ds, (5.1)

for 0 ≤ t ≤ T. Thus, when |X_n(0)| + ‖Y_{n,1}‖_T + · · · + ‖Y_{n,k}‖_T + T|h(0)| ≤ K, Gronwall's inequality gives

|X_n(t)| ≤ Ke^{ct} and ‖X_n‖_T ≤ Ke^{cT}.

5.2.3. Stochastic Boundedness for Martingales

We now provide ways to get stochastic boundedness for sequences of martingales in D from associated sequences of random variables. Our first result exploits the classical submartingale-maximum inequality; e.g., see p. 13 of Karatzas and Shreve (1988). We say that a function f : R → R is even if f(−x) = f(x) for all x ∈ R.

Lemma 5.7. (stochastic boundedness from the maximum inequality) Suppose that, for each n ≥ 1, M_n ≡ {M_n(t) : t ≥ 0} is a martingale (with respect to a specified filtration) with sample paths in D. Also suppose that, for each T > 0, there exists an even nonnegative convex function f : R → R with first derivative f′(t) > 0 for t > 0 (e.g., f(t) ≡ t²), there exists a positive constant K ≡ K(T, f), and there exists an integer n_0 ≡ n_0(T, f, K), such that

E[f(M_n(T))] ≤ K for all n ≥ n_0.

Then the sequence of stochastic processes {M_n : n ≥ 1} is stochastically bounded in D.

Proof. Since any set of finitely many random elements of D is automatically tight, by Theorem 1.3 of Billingsley (1999), it suffices to consider n ≥ n_0. Since f is continuous and f′(t) > 0 for t > 0, t > c if and only if f(t) > f(c) for t > 0. Since f is even,

E[f(M_n(t))] = E[f(|M_n(t)|)] ≤ E[f(|M_n(T)|)] = E[f(M_n(T))] ≤ K for all t, 0 ≤ t ≤ T.

Since these moments are finite and f is convex, the stochastic process {f(M_n(t)) : 0 ≤ t ≤ T} is a submartingale for each n ≥ 1, so we can apply the submartingale-maximum inequality to get

P( ‖M_n‖_T > c ) = P( ‖f ◦ M_n‖_T > f(c) ) ≤ E[f(M_n(T))]/f(c) ≤ K/f(c) for all n ≥ n_0.

Since f(c) → ∞ as c → ∞, we have the desired conclusion.

We now establish another sufficient condition for stochastic boundedness of square-integrable martingales by applying the Lenglart-Rebolledo inequality; see p. 66 of Liptser and Shiryayev (1986) or p. 30 of Karatzas and Shreve (1998).


Lemma 5.8. (Lenglart-Rebolledo inequality) Suppose that M ≡ {M(t) : t ≥ 0} is a square-integrable martingale (with respect to a specified filtration) with predictable quadratic variation 〈M〉 ≡ {〈M〉(t) : t ≥ 0}, i.e., such that M² − 〈M〉 ≡ {M(t)² − 〈M〉(t) : t ≥ 0} is a martingale by the Doob-Meyer decomposition. Then, for all c > 0 and d > 0,

P( sup_{0≤t≤T} |M(t)| > c ) ≤ d/c² + P( 〈M〉(T) > d ). (5.2)

As a consequence we have the following criterion for stochastic boundedness of a sequence of square-integrable martingales.

Lemma 5.9. (stochastic-boundedness criterion for square-integrable martingales) Suppose that, for each n ≥ 1, M_n ≡ {M_n(t) : t ≥ 0} is a square-integrable martingale (with respect to a specified filtration) with predictable quadratic variation 〈M_n〉 ≡ {〈M_n〉(t) : t ≥ 0}, i.e., such that M_n² − 〈M_n〉 ≡ {M_n(t)² − 〈M_n〉(t) : t ≥ 0} is a martingale by the Doob-Meyer decomposition. If the sequence of random variables {〈M_n〉(T) : n ≥ 1} is stochastically bounded in R for each T > 0, then the sequence of stochastic processes {M_n : n ≥ 1} is stochastically bounded in D.

Proof. For ε > 0 given, apply the assumed stochastic boundedness of the sequence {〈M_n〉(T) : n ≥ 1} to obtain a constant d such that

P( 〈M_n〉(T) > d ) < ε/2 for all n ≥ 1.

Then for that determined d, choose c such that d/c² < ε/2. By the Lenglart-Rebolledo inequality (5.2), these two inequalities imply that

P( sup_{0≤t≤T} |M_n(t)| > c ) < ε. (5.3)

5.2.4. FWLLN from Stochastic Boundedness

We will want to apply stochastic boundedness in D to imply the desired fluid limit in Lemmas 4.2 and 4.3. The fluid limit corresponds to a functional weak law of large numbers (FWLLN).

Lemma 5.10. (FWLLN from stochastic boundedness in D^k) Let {X_n : n ≥ 1} be a sequence of random elements of D^k. Let {a_n : n ≥ 1} be a sequence of positive real numbers such that a_n → ∞ as n → ∞. If the sequence {X_n : n ≥ 1} is stochastically bounded in D^k, then

X_n/a_n ⇒ η in D^k as n → ∞, (5.4)

where η(t) ≡ (0, 0, . . . , 0), t ≥ 0.

Proof. As specified in Definition 5.4, stochastic boundedness of the sequence {X_n : n ≥ 1} in D^k corresponds to stochastic boundedness of the associated sequence {‖X_n‖_T : n ≥ 1} in R for each T > 0. By Definition 5.3, stochastic boundedness in R is equivalent to tightness. It is easy to verify directly that we then have tightness (or, equivalently, stochastic boundedness) for the associated sequence {‖X_n‖_T / a_n : n ≥ 1} in R. By Prohorov's theorem, Theorem 5.1, tightness on R (or any CSMS) is equivalent to relative compactness. Hence consider a convergent subsequence {‖X_{n_k}‖_T / a_{n_k} : n_k ≥ 1} of the sequence {‖X_n‖_T / a_n : n ≥ 1} in R:


‖X_{n_k}‖_T / a_{n_k} → L as n_k → ∞. It suffices to show that P(L = 0) = 1; then all convergent subsequences will have the same limit, which implies convergence to that limit. For that purpose, consider the associated subsequence {‖X_{n_k}‖_T : n_k ≥ 1} in R. It too is tight. So by Prohorov's theorem again, it too is relatively compact. Thus there exists a convergent sub-subsequence: ‖X_{n_{k_l}}‖_T ⇒ L′ in R. It follows immediately that

‖X_{n_{k_l}}‖_T / a_{n_{k_l}} ⇒ 0 in R as n_{k_l} → ∞. (5.5)

This can be regarded as a consequence of the generalized continuous mapping theorem (GCMT), Theorem 3.4.4 of Whitt (2002), which involves a sequence of continuous functions: Consider the functions f_n : R → R defined by f_n(x) ≡ x/a_n, n ≥ 1, and the limiting zero function f : R → R defined by f(x) ≡ 0 for all x. It is easy to see that f_n(x_n) → f(x) ≡ 0 whenever x_n → x in R. Thus the GCMT implies the limit in (5.5). Consequently, the limit L we found for the subsequence {‖X_{n_k}‖_T / a_{n_k} : n_k ≥ 1} must actually be 0. Since that must be true for all convergent subsequences, we must have the claimed convergence in (5.4).

5.3. Tightness Criteria

The standard characterization for tightness of a sequence of stochastic processes in D, originally developed by Skorohod (1956) and presented in Billingsley (1968), involves suprema. Since then, more elementary one-dimensional criteria have been developed; see Billingsley (1974), Kurtz (1975), Aldous (1978, 1989), Jacod et al. (1983), §§3.8 and 3.9 of Ethier and Kurtz (1986) and §16 of Billingsley (1999). Since these simplifications help in proving the martingale FCLT, we will present some of the results here.

5.3.1. Classic Criteria Involving a Modulus of Continuity

We start by presenting the classical characterization of tightness, as in Theorems 13.2 and 16.8 of Billingsley (1999). For that purpose we define functions w and w′ that can serve as a modulus of continuity. For any x ∈ D and subset A of [0,∞), let

w(x, A) ≡ sup{ |x(t_1) − x(t_2)| : t_1, t_2 ∈ A }. (5.6)

For any x ∈ D, T > 0 and δ > 0, let

w(x, δ, T) ≡ sup{ w(x, [t_1, t_2]) : 0 ≤ t_1 < t_2 ≤ (t_1 + δ) ∧ T } (5.7)

and

w′(x, δ, T) ≡ inf_{{t_i}} max_{1≤i≤k} w(x, [t_{i−1}, t_i)), (5.8)

where the infimum in (5.8) is over all k and all subsets {t_i} of [0, T] of size k + 1 such that

0 = t_0 < t_1 < · · · < t_k = T with t_i − t_{i−1} > δ for 1 ≤ i ≤ k − 1.

(We do not require that t_k − t_{k−1} > δ.)

The following is a variant of the classical characterization of tightness; see Theorem 16.8 of Billingsley (1999).

Theorem 5.2. (classical characterization of tightness) A sequence of stochastic processes {X_n : n ≥ 1} in D is tight if and only if

(i) the sequence {X_n : n ≥ 1} is stochastically bounded in D


and

(ii) for each T > 0 and ε > 0,

lim_{δ↓0} lim sup_{n→∞} P( w′(X_n, δ, T) > ε ) = 0.

If the modulus w(X_n, δ, T) is substituted for w′(X_n, δ, T) in condition (ii) of Theorem 5.2, then the sequence {X_n} is said to be C-tight, because then the sequence is again tight but the limit of any convergent subsequence must have continuous sample paths; see Theorem 15.5 of Billingsley (1968). With this modified condition (ii), condition (i) is implied by having the sequence {X_n(0)} be tight in R.
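For intuition about the two moduli (an illustration only, not from the paper): for a path with a single jump, w stays equal to the jump size as δ ↓ 0, whereas w′ can be made 0 by splitting [0, T] at the jump point; this is exactly why w′ characterizes D-tightness while w characterizes C-tightness. The sketch below evaluates both on a grid, using the natural partition at the known jump point rather than the full infimum in (5.8).

import numpy as np

def osc(x, t, a, b):
    # oscillation of the grid function x over grid times in [a, b), as in (5.6)
    vals = x[(t >= a) & (t < b)]
    return vals.max() - vals.min() if vals.size else 0.0

def w_mod(x, t, delta, T):
    # w(x, delta, T) in (5.7): maximal oscillation over windows of width delta
    return max(osc(x, t, a, min(a + delta, T) + 1e-12) for a in t[t <= T])

def w_prime_at(x, t, jump_points, T):
    # w'(x, delta, T) in (5.8) evaluated at the partition with breakpoints at the jump points
    pts = [0.0] + [p for p in jump_points if 0.0 < p < T] + [T + 1e-12]
    return max(osc(x, t, a, b) for a, b in zip(pts[:-1], pts[1:]))

T, dt = 2.0, 1e-3
t = np.arange(0.0, T + dt, dt)
x = (t >= 1.0).astype(float)                       # a single unit jump at time 1
print(w_mod(x, t, delta=0.1, T=T))                 # 1.0: w sees the jump, so no C-tightness
print(w_prime_at(x, t, jump_points=[1.0], T=T))    # 0.0: w' hides the jump, consistent with D-tightness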

Conditions (i) and (ii) in Theorem 5.2 are both somewhat hard to verify because they involve suprema. We have shown how the stochastic-boundedness condition (i) can be made one-dimensional for martingales via Lemmas 5.7 and 5.9. The following is another result, which exploits the modulus condition (ii) in Theorem 5.2; see p. 175 of Billingsley (1999). To state it, let J be the maximum-jump function, defined for any x ∈ D and T > 0 by

J(x, T) ≡ sup{ |x(t) − x(t−)| : 0 < t ≤ T }. (5.9)

Lemma 5.11. (substitutes for stochastic boundedness) In the presence of the modulus condition (ii) in Theorem 5.2, each of the following is equivalent to the stochastic-boundedness condition (i) in Theorem 5.2:

(i) The sequence {X_n(t) : n ≥ 1} is stochastically bounded in R for each t in a dense subset of [0,∞).

(ii) The sequence {X_n(0) : n ≥ 1} is stochastically bounded in R and, for each T > 0, the sequence {J(X_n, T) : n ≥ 1} is stochastically bounded in R.

We also mention the role of the maximum-jump function J in (5.9) in characterizing continuous limits; see Theorem 13.4 of Billingsley (1999):

Lemma 5.12. (identifying continuous limits) Suppose that X_n ⇒ X in D. Then P(X ∈ C) = 1 if and only if J(X_n, T) ⇒ 0 in R for each T > 0.

5.3.2. Simplifying the Modulus Condition

Simplifying the modulus condition (ii) in Theorem 5.2 is of even greater interest. We first present results from Ethier and Kurtz (1986). Conditions (i) and (iii) below are of particular interest because they show that, for each n and t, we need only consider the process at the two single time points t + u and t − v for u > 0 and v > 0. As we will see in Lemma 5.13 below, we can obtain a useful sufficient condition involving only the single time point t + u (ignoring t − v).

For the results in this section, we assume that the strong stochastic-boundedness condition (i) in Theorem 5.2 is in force. For some of the alternatives to the modulus condition (ii) in Theorem 5.2 it is also possible to simplify the stochastic-boundedness condition, as in Lemma 5.11, but we do not carefully examine that issue.

Theorem 5.3. (substitutes for the modulus condition) In the presence of the stochastic-boundedness condition (i) in Theorem 5.2, each of the following is equivalent to the modulus condition (ii):


(i) For each T > 0, there exists a constant β > 0 and a family of nonnegative random variables {Z_n(δ, T) : n ≥ 1, δ > 0} such that

E[ (1 ∧ |X_n(t + u) − X_n(t)|)^β | F_{n,t} ] (1 ∧ |X_n(t) − X_n(t − v)|)^β ≤ E[ Z_n(δ, T) | F_{n,t} ] (5.10)

w.p.1 for 0 ≤ t ≤ T, 0 ≤ u ≤ δ and 0 ≤ v ≤ t ∧ δ, where F_{n,t} is the σ-field in the internal filtration of {X_n(t) : t ≥ 0},

lim_{δ↓0} lim sup_{n→∞} E[Z_n(δ, T)] = 0, (5.11)

and

lim_{δ↓0} lim sup_{n→∞} E[(1 ∧ |X_n(δ) − X_n(0)|)^β] = 0. (5.12)

(ii) The sequence of stochastic processes {{f(X_n(t)) : t ≥ 0} : n ≥ 1} is tight in D for each function f : R → R in a dense subset of all bounded continuous functions in the topology of uniform convergence over bounded intervals.

(iii) For each function f : R → R in a dense subset of all bounded continuous functions in the topology of uniform convergence over bounded intervals, and T > 0, there exists a constant β > 0 and a family of nonnegative random variables {Z_n(δ, f, T) : n ≥ 1, δ > 0} such that

E[ |f(X_n(t + u)) − f(X_n(t))|^β | F_{n,f,t} ] |f(X_n(t)) − f(X_n(t − v))|^β ≤ E[ Z_n(δ, f, T) | F_{n,f,t} ] (5.13)

w.p.1 for 0 ≤ t ≤ T, 0 ≤ u ≤ δ and 0 ≤ v ≤ t ∧ δ, where F_{n,f,t} is the σ-field in the internal filtration of {f(X_n(t)) : t ≥ 0}, and

lim_{δ↓0} lim sup_{n→∞} E[Z_n(δ, f, T)] = 0, (5.14)

and

lim_{δ↓0} lim sup_{n→∞} E[|f(X_n(δ)) − f(X_n(0))|^β] = 0. (5.15)

Proof. Condition (i) is condition (b) in Theorem 3.8.6 of Ethier and Kurtz (1986) for the case of real-valued stochastic processes. Condition (ii) is Theorem 3.9.1 of Ethier and Kurtz (1986). Condition (iii) is a modification of condition (b) in Theorem 3.8.6 of Ethier and Kurtz (1986), exploiting condition (ii) and the fact that our stochastic processes are real valued. Since the functions f are bounded in (iii), there is no need to replace the usual metric m(c, d) ≡ |c − d| with the bounded metric q ≡ m ∧ 1.

In applications of Theorem 5.3, it is advantageous to replace conditions (i) and (iii) in Theorem 5.3 with stronger sufficient conditions, which involve only the conditional expectation on the left. By doing so, we need to consider only conditional distributions of the processes at a single time point t + u with u > 0, given the relevant history F_{n,t} up to time t. We need to do an estimate for all t and n, but given specific values of t and n, we need to consider only one future time point t + u for u > 0. These simplified conditions are sufficient, but not necessary, for D-tightness. On the other hand, they are not sufficient for C-tightness; see Remark 5.1 below.

In addition, for condition (iii) in Theorem 5.3, it suffices to specify a specific dense family of functions. As in the Ethier-Kurtz proof of the martingale FCLT (their Theorem 7.1.4), we introduce smoother functions in order to exploit Taylor series expansions. Indeed, we present the versions of Theorem 5.3 actually applied by Ethier and Kurtz (1986) in their proof of their Theorem 7.1.4.


Lemma 5.13. (simple sufficient criterion for tightness) The following are sufficent, but notnecessary, conditions for a sequence of stochastic processes Xn : n ≥ 1 in D to be tight:

(i) The sequence Xn : n ≥ 1 is stochastically bounded in D

and either(ii.a) For each n ≥ 1, the stochastic process Xn is adapted to a filtration Fn ≡ Fn,t : t ≥

0. In addition, for each function f : R → R belonging to C∞c (having compact support andderivatives of all orders) and T > 0, there exists a family of nonnegative random variablesZn(δ, f, T ) : n ≥ 1, δ > 0 such that

|E [f(Xn(t + u))− f(Xn(t)) | Fn,t]| ≤ E [Zn(δ, f, T ) | Fn,t] (5.16)

w.p.1 for 0 ≤ t ≤ T and 0 ≤ u ≤ δ and

limδ↓0

lim supn→∞

E[Zn(δ, f, T )] = 0 . (5.17)

or(ii.b) For each n ≥ 1, the stochastic process Xn is adapted to a filtration Fn ≡ Fn,t :

t ≥ 0. In addition, for each T > 0, there exists a family of nonnegative random variablesZn(δ, T ) : n ≥ 1, δ > 0 such that

|E[(Xn(t + u) − Xn(t))² | Fn,t]| ≤ E[Zn(δ, T) | Fn,t]   (5.18)

w.p.1 for 0 ≤ t ≤ T and 0 ≤ u ≤ δ and

lim_{δ↓0} lim sup_{n→∞} E[Zn(δ, T)] = 0 .   (5.19)

Proof. We apply Theorem 5.3. For condition (ii.a), we apply Theorem 5.3 (iii). We have specified a natural class of functions f that are dense in the set of all continuous bounded real-valued functions with the topology of uniform convergence over bounded intervals. We have changed the filtration. Theorem 5.3 and Theorem 3.8.6 of Ethier and Kurtz (1986) specify the internal filtration of the process, which here would be the process {f(Xn(t)) : t ≥ 0}, but it is convenient to work with the more refined filtration associated with Xn. By taking conditional expected values, conditional on the coarser filtration generated by {f(Xn(t)) : t ≥ 0}, we can deduce the corresponding inequality conditioned on the coarser filtration. The conditions here follow from condition (5.10) by taking β = 2, because

E[|f(Xn(t + u)) − f(Xn(t))|² | Fn,t] = E[f(Xn(t + u))² − f(Xn(t))² | Fn,t] − 2f(Xn(t))E[f(Xn(t + u)) − f(Xn(t)) | Fn,t] ,   (5.20)

(see (1.35) on p. 343 of EK), so that

E[|f(Xn(t + u)) − f(Xn(t))|² | Fn,t] ≤ E[Zn(δ, f², T) | Fn,t] + 2‖f‖ E[Zn(δ, f, T) | Fn,t]   (5.21)

under condition (5.16). Note that f² belongs to our class of functions for every function f in the class. Finally, note that under (5.16)

E[|f(Xn(δ)) − f(Xn(0))|²] ≤ E[Zn(δ, f, T)] ,   (5.22)

so that the limit (5.15) holds by (5.17).
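As a purely illustrative sanity check of the elementary algebraic identity underlying (5.20) (with a standing for f(Xn(t + u)) and b for f(Xn(t))), the following minimal Python/sympy sketch verifies it symbolically; it plays no role in the argument.

    import sympy as sp

    a, b = sp.symbols('a b', real=True)   # a = f(Xn(t+u)), b = f(Xn(t))
    lhs = (a - b)**2
    rhs = (a**2 - b**2) - 2*b*(a - b)     # the identity used in (5.20)
    assert sp.simplify(lhs - rhs) == 0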


Now consider condition (ii.b). Note this is a direct consequence of Theorem 5.3 (i) using β = 2. Condition (5.18) is stronger than (5.10) because the ∧1 has been removed. As in (5.22),

E[|Xn(δ) − Xn(0)|²] ≤ E[Zn(δ, T)] ,   (5.23)

so that the limit (5.12) holds by (5.19).

We have mentioned that Theorem 5.3 and Lemma 5.13 come from Ethier and Kurtz (1986).

As indicated there, these in turn come from Kurtz (1975). However, Lemma 5.13 is closely related to a sufficient condition for tightness proposed by Billingsley (1974). Like Lemma 5.13, this sufficient condition is especially convenient for treating Markov processes. We state a version of that condition here. For that purpose, let αn(λ, ε, δ, T) be a number such that

P (|Xn(u)−Xn(tm)| > ε|Xn(t1), · · · , Xn(tm)) ≤ αn(λ, ε, δ, T ) (5.24)

holds with probability 1 on the set {max_i |Xn(ti)| ≤ λ} for all m and all m time points ti with

0 ≤ t1 ≤ · · · ≤ tm−1 ≤ tm < u ≤ T and u− tm ≤ δ . (5.25)

The following is a variant of Theorem 1 of Billingsley (1974). (He has the stochastic-boundedness condition (i) below replaced by a weaker condition.)

Lemma 5.14. (another simple sufficient condition for tightness) Alternative sufficient, but not necessary, conditions for a sequence of stochastic processes {Xn : n ≥ 1} in D to be tight are the following:

(i) The sequence {Xn : n ≥ 1} is stochastically bounded in D

and

(ii) For each T > 0, ε > 0 and λ < ∞,

lim_{δ↓0} lim sup_{n→∞} αn(λ, ε, δ, T) = 0 ,   (5.26)

for αn(λ, ε, δ, T ) defined in (5.24).

Remark 5.1. (sufficient for D-tightness but not C-tightness.) The sufficient conditions in Lemmas 5.13 and 5.14 are sufficient but not necessary for D-tightness. On the other hand, these conditions are not sufficient for C-tightness. To substantiate these claims, it suffices to consider simple examples. The single function x(t) = 1[1,∞)(t), t ≥ 0, is necessarily tight in D, but the conditions in Lemmas 5.13 and 5.14 are not satisfied for it. On the other hand, the stochastic process X(t) = 1[T,∞)(t), t ≥ 0, where T is an exponential random variable with mean 1, is D-tight, but not C-tight. By the lack-of-memory property of the exponential distribution, the conditions of Lemmas 5.13 and 5.14 are satisfied for this simple random element of D.

5.3.3. Criteria Involving Stopping Times and Quadratic Variations

We conclude this section by mentioning alternative criteria for tightness specifically intended for martingales. These criteria involve stopping times and the quadratic-variation processes. The criteria in terms of stopping times started with Aldous (1978) and Rebolledo (1980). Equivalence with the conditions in Theorem 5.3 and Lemma 5.13 was shown in Theorems 2.7 of Kurtz (1981) and 3.8.6 of Ethier and Kurtz (1986); see also p. 176 of Billingsley (1999). These criteria are very natural for proving tightness of martingales, as Aldous (1978, 1989) and Rebolledo (1980) originally showed. Aldous' (1978) proof of (a generalization of) the martingale FCLT from McLeish (1974) is especially nice.


Theorem 5.4. (another substitute for the modulus condition) Suppose that the stochastic-boundedness condition (i) in Theorem 5.2 holds.

The following is equivalent to the modulus condition in Theorem 5.2, condition (ii) in Theorem 5.3 and thus to tightness:

For each T > 0, there exists a constant β > 0 such that

lim_{δ↓0} lim sup_{n→∞} Cn(δ, T) = 0 ,   (5.27)

where

Cn(δ, T) ≡ sup_{τ∈Sn,T} sup_{0≤u≤δ} E[ (1 ∧ |Xn(τ + u) − Xn(τ)|)^β (1 ∧ |Xn(τ) − Xn(τ − v)|)^β ]   (5.28)

for 0 ≤ v ≤ δ with τ − v ≥ 0, where Sn,T is the collection of all finite-valued stopping times with respect to the internal filtration of Xn, bounded by T.

Theorem 5.5. (another substitute for the sufficient condition) Suppose that the stochastic-boundedness condition (i) in Theorem 5.2 holds.

Then the following are equivalent sufficient, but not necessary, conditions for the modulus condition (ii) in Theorem 5.2, condition (ii) in Theorem 5.3 and thus for tightness:

(i) For each T > 0 and sequence {(τn, δn) : n ≥ 1}, where τn is a finite-valued stopping time with respect to the internal filtration generated by Xn, with τn ≤ T, and δn is a positive constant with δn ↓ 0,

Xn(τn + δn)−Xn(τn) ⇒ 0 as n →∞ . (5.29)

(ii) For each ε > 0, η > 0 and T > 0, there exist δ > 0, n0 and two sequences {τn,i : n ≥ n0}, i = 1, 2, where τn,i is a stopping time with respect to the internal filtration generated by Xn such that 0 ≤ τn,1 < τn,2 ≤ T and

P (Xn(τn,2)−Xn(τn,1) > ε, τn,2 − τn,1 ≤ δ) < η for all n ≥ n0 . (5.30)

Aldous (1978) showed that the two conditions in Lemma 5.14 imply condition (5.29), because

P (Xn(τn + δn)−Xn(τn) > ε) ≤ αn(λ, ε, δn, T ) + P (‖Xn‖T > λ) . (5.31)

Related criteria for tightness in terms of quadratic variation processes are presented in Rebolledo (1980), Jacod et al. (1983) and §§VI.4-5 of Jacod and Shiryayev (1987). The following is a minor variant of Theorem VI.4.13 on p. 322 of Jacod and Shiryayev (1987). Lemma 5.2 shows that C-tightness of the sum ∑_{i=1}^k 〈Xn,i, Xn,i〉 is equivalent to tightness of the components.

Theorem 5.6. (tightness criterion in terms of the angle-bracket process) Suppose that Xn ≡ (Xn,1, . . . , Xn,k) is a locally-square-integrable martingale in Dk for each n ≥ 1, so that the predictable quadratic covariation processes 〈Xn,i, Xn,j〉 are well defined. The sequence {Xn : n ≥ 1} is C-tight if

(i) The sequence {Xn,i(0) : n ≥ 1} is tight in R for each i, and

(ii) the sequence {〈Xn,i, Xn,i〉} is C-tight for each i.

The following is a minor variant of Lemma 11 of Rebolledo (1980). We again apply Lemma 5.2 to treat the vector case. Since a local martingale with bounded jumps is locally square integrable, the hypothesis of the next theorem is not weaker than the hypothesis of the previous theorem.


Theorem 5.7. (tightness criterion in terms of the square-bracket processes) Suppose that Xn ≡ (Xn,1, . . . , Xn,k) is a local martingale in Dk for each n ≥ 1. The sequence {Xn : n ≥ 1} is C-tight if

(i) For all T > 0, there exists n0 such that

J(Xn,i, T) ≡ ‖∆Xn,i‖T ≡ sup_{0≤t≤T} |Xn,i(t) − Xn,i(t−)| ≤ bn for all i and n ≥ n0 ,   (5.32)

where bn are real numbers such that bn → 0 as n → ∞; and

(ii) The sequence of optional quadratic variation processes {[Xn,i]} is C-tight for each i.

In closing we remark that these quadratic-variation tightness conditions in Theorems 5.6 and 5.7 are often easy to verify because these are nondecreasing processes; see Lemma 8.1.

6. Completing the Proof of Theorem 1.1.

In the next three subsections we complete the proof of Theorem 1.1. In §6.1 we show that the required fluid limit in Lemma 4.3 follows from the stochastic boundedness of the sequence of stochastic processes {Xn : n ≥ 1}. In §6.2 we finish the proof of the fluid limit by proving that the associated predictable quadratic variation processes are stochastically bounded. Finally, in §6.3 we show how to remove the condition on the initial conditions.

6.1. The Fluid Limit from Stochastic Boundedness of Xn in D

We now show how to apply stochastic boundedness to imply the desired fluid limit in Lemma 4.3. Here we simply apply Lemma 5.10 to our particular sequence of stochastic processes {Xn : n ≥ 1}, with

Xn(t) ≡ (Qn(t) − n)/√n ,  t ≥ 0,   (6.1)

as defined in (3.28), and an ≡ √n for n ≥ 1.

Lemma 6.1. (application to queueing processes) Let Xn be the random elements of D defined in (6.1) or (3.28). If the sequence {Xn : n ≥ 1} is stochastically bounded in D, then

‖n−1Qn − ω‖T ⇒ 0 as n →∞ for all T > 0 , (6.2)

where ω(t) ≡ 1, t ≥ 0. Equivalently,

sup_{0≤t≤T} |(Qn(t)/n) − 1| ⇒ 0 as n → ∞ for all T > 0 .   (6.3)

Proof. As a consequence of Lemma 5.10, from the stochastic boundedness of Xn in D we obtain

Xn/√n ⇒ η in D as n → ∞ ,   (6.4)

where η is the zero function defined above. The limit in (6.4) is equivalent to (6.2) and (6.3).

In §4.2 we already observed that Lemma 4.3 implies the desired Lemma 4.2. So we have completed the required proof.


Recap: How this all goes together. The consequence of the last three sections is as follows: In order to complete the martingale argument to establish the required fluid limit in Lemmas 4.2 and 4.3, it suffices to establish the stochastic boundedness of two sequences of random variables: {〈Mn,1〉(T) : n ≥ 1} and {〈Mn,2〉(T) : n ≥ 1} for any T > 0.

Here is an explanation: For each n ≥ 1, the associated stochastic processes {〈Mn,1〉(t) : t ≥ 0} and {〈Mn,2〉(t) : t ≥ 0} are the predictable quadratic variations (compensators) of the scaled martingales Mn,1 and Mn,2 in (3.25), (3.26) and (4.1). First, by Lemma 5.9, the stochastic boundedness of these two sequences of PQV random variables implies stochastic boundedness in D of the two sequences of scaled martingales {Mn,1 : n ≥ 1} and {Mn,2 : n ≥ 1} in (3.26). That, in turn, under condition (1.4), implies the stochastic boundedness in D of the sequence of scaled queue-length processes {Xn : n ≥ 1} in (3.28) and (6.1) by Lemma 5.6. Finally, by Lemma 5.10 the stochastic boundedness of Xn in D implies the required fluid limit in (6.2). The fluid limit in (6.2) was just what was needed in Lemmas 4.2 and 4.3.

6.2. Stochastic Boundedness of the Predictable Quadratic Variations

We have observed at the end of the last section that it only remains to establish the stochastic boundedness of two sequences of PQV random variables: {〈Mn,1〉(t) : n ≥ 1} and {〈Mn,2〉(t) : n ≥ 1} for any t > 0. For each n ≥ 1, the associated stochastic processes {〈Mn,1〉(t) : t ≥ 0} and {〈Mn,2〉(t) : t ≥ 0} are the predictable quadratic variations (compensators) of the scaled martingales Mn,1 and Mn,2 in (3.30). We note that the stochastic boundedness of {〈Mn,1〉(t) : t ≥ 0} is trivial because it is deterministic.

Lemma 6.2. (stochastic boundedness of 〈Mn,2〉) Under the assumptions in Theorems 1.1 and 3.4, the sequence {〈Mn,2〉(t) : n ≥ 1} is stochastically bounded for each t > 0, where

〈Mn,2〉(t) ≡ (µ/n) ∫_0^t Qn(s) ds,  t ≥ 0 .

Proof. It suffices to apply the crude inequality in Lemma 3.3:

(µ/n) ∫_0^t Qn(s) ds ≤ (µt/n)(Qn(0) + A(λnt)) .

By Lemma 5.5, it suffices to show that the two sequences {Qn(0)/n} and {A(λnt)/n} are stochastically bounded. By condition (1.4) and the CMT, we have the WLLN

(Qn(0) − n)/n ⇒ 0 as n → ∞, so that Qn(0)/n ⇒ 1 as n → ∞ ,

but that convergence implies stochastic boundedness; see Corollary 5.1. Turning to the other term, since λn = nµ, we have

A(λnt)/n → µt as n → ∞ w.p.1

by the SLLN for Poisson processes. But convergence w.p.1 implies convergence in distribution, which in turn implies stochastic boundedness; again apply Corollary 5.1. Hence we have completed the proof.
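As an illustration only (not part of the argument), the two limits used in this proof are easy to see numerically; the following minimal Python sketch, with hypothetical values µ = 1 and t = 2, samples A(λnt), which is Poisson with mean λnt for a rate-1 Poisson process A, and shows the concentration of A(λnt)/n around µt as n grows.

    import numpy as np

    rng = np.random.default_rng(0)
    mu, t = 1.0, 2.0                                         # hypothetical parameter values
    for n in [10, 100, 1000, 10000]:
        lam_n = n * mu
        samples = rng.poisson(lam_n * t, size=2000) / n      # samples of A(lambda_n t)/n
        print(n, round(samples.mean(), 4), round(samples.std(), 4))  # mean ~ mu*t, spread ~ sqrt(mu*t/n)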


6.3. The Initial Conditions

The results in §§3–6.1 have used the finite moment condition E[Qn(0)] < ∞, but we want to establish Theorem 1.1 without this condition. In particular, note that this moment condition appears prominently in Lemmas 3.4 and 3.5 and in the resulting final martingale representations for the scaled processes, e.g., as stated in Theorems 3.4 and 3.5.

However, for the desired Theorem 1.1, we do not need to directly impose this moment condition. We do need the assumed convergence of the scaled initial conditions in (1.4), but we can circumvent this moment condition by defining bounded initial conditions that converge to the same limit. For that purpose, we can work with

Q̃n(0) ≡ Qn(0) ∧ 2n,  n ≥ 1 ,   (6.5)

and

X̃n(0) ≡ (Q̃n(0) − n)/√n ,  n ≥ 1 .   (6.6)

Then, for each n ≥ 1, we use Q̃n(0) and X̃n(0) instead of Qn(0) and Xn(0). We then obtain Theorem 1.1 for these modified processes:

X̃n ⇒ X in D as n → ∞ .   (6.7)

However, since the assumed convergence in (1.4) implies that P(Qn(0) > 2n) → 0, we also have

P(X̃n ≠ Xn) → 0 as n → ∞ .   (6.8)

Hence, we have Xn ⇒ X as well.

7. Other Models

In this section we discuss how to treat other models closely related to our initial M/M/∞ model. We consider the Erlang-A model in §7.1. We consider finite waiting rooms in §7.2. Finally, we indicate how to treat general non-Poisson arrival processes in §7.3.

7.1. The Erlang A Model

In this section we prove the corresponding many-server heavy-traffic limit for the M/M/n/∞+M (or Erlang-A) model in the QED regime. As before, the arrival rate is λ and the individual service rate is µ. Now there are n servers and unlimited waiting room with the FCFS service discipline. Customer times to abandon are i.i.d. exponential random variables with a mean of 1/θ. Thus individual customers waiting in queue abandon at a constant rate θ. The Erlang-C model arises when there is no abandonment, which occurs when θ = 0. The Erlang-C model is covered as a special case of the result below.

Let Q(t) denote the number of customers in the system at time t, either waiting or being served. It is well known that the stochastic process Q ≡ {Q(t) : t ≥ 0} is a birth-and-death stochastic process with constant birth rate λk = λ and state-dependent death rate µk = (k ∧ n)µ + (k − n)+θ, k ≥ 0, where a ∧ b ≡ min{a, b}, a ∨ b ≡ max{a, b} and (a)+ ≡ a ∨ 0 for real numbers a and b.
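As a concrete illustration of this birth-and-death structure (not used in the proofs), the stationary distribution of Q can be computed numerically from the detailed-balance recursion πk+1 = πk λ/µk+1. The following minimal Python sketch does this for hypothetical parameter values on a truncated state space.

    import numpy as np

    lam, mu, theta, n = 95.0, 1.0, 0.5, 100     # hypothetical parameter values
    K = 400                                     # truncation level for the state space

    def death_rate(k):
        # mu_k = (k ∧ n) mu + (k - n)^+ theta
        return min(k, n) * mu + max(k - n, 0) * theta

    pi = np.zeros(K + 1)
    pi[0] = 1.0
    for k in range(K):
        pi[k + 1] = pi[k] * lam / death_rate(k + 1)   # detailed balance
    pi /= pi.sum()
    print("P(all servers busy) ~", pi[n:].sum(), "  E[Q] ~", float(np.arange(K + 1) @ pi))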

As in Theorem 1.1, the many-server heavy-traffic limit involves a sequence of Erlang-A queueing models. As before, we let this sequence be indexed by n, but now this n coincides with the number of servers. Thus now we are letting the number of servers be finite, but then letting that number grow. At the same time, we let the arrival rate increase. As before, we let the arrival rate in model n be λn, but now we stipulate that λn grows with n.


At the same time, we hold the individual service rate µ and abandonment rate θ fixed. Let ρn ≡ λn/nµ be the traffic intensity in model n. We stipulate that

(1 − ρn)√n → β as n → ∞ ,   (7.1)

where β is a (finite) real number. That is equivalent to assuming that

(nµ − λn)/√n → βµ as n → ∞ ,   (7.2)

as in (1.8). Conditions (7.1) and (7.2) are known to characterize the QED many-server heavy-traffic regime; see Halfin and Whitt (1981) and Puhalskii and Reiman (2000). The many-server heavy-traffic limit theorem for the Erlang-A model was proved by Garnett, Mandelbaum and Reiman (2002), exploiting Stone (1963). See Zeltyn and Mandelbaum (2005) and Mandelbaum and Zeltyn (2005) for extensions and elaboration. For related results for single-server models with customer abandonment, see Ward and Glynn (2003a,b, 2005).
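For a concrete feel for the scaling in (7.1)-(7.2) (purely as an illustration), one can choose λn = nµ − βµ√n, the familiar square-root-staffing relation; the minimal Python sketch below, with hypothetical values of µ and β, confirms that (1 − ρn)√n then equals β for every n.

    import math

    mu, beta = 1.0, 0.8                              # hypothetical parameter values
    for n in [25, 100, 400, 1600, 6400]:
        lam_n = n * mu - beta * mu * math.sqrt(n)    # square-root staffing
        rho_n = lam_n / (n * mu)
        print(n, round(rho_n, 4), round((1.0 - rho_n) * math.sqrt(n), 6))   # last column equals beta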

Here is the QED many-server heavy-traffic limit theorem for the M/M/n/∞+ M model:

Theorem 7.1. (heavy-traffic limit in D for the M/M/n + M model) Consider the sequence of M/M/n/∞ + M models defined above, with the scaling in (7.1). Let Xn be as defined in (1.3). If

Xn(0) ⇒ X(0) in R as n → ∞ ,   (7.3)

then

Xn ⇒ X in D as n → ∞ ,   (7.4)

where X is the diffusion process with infinitesimal mean m(x) = −βµ − µx for x < 0 and m(x) = −βµ − θx for x > 0, and infinitesimal variance σ²(x) = 2µ. Alternatively, the limit process X satisfies the stochastic integral equation

X(t) = X(0) − µβt + √(2µ) B(t) − ∫_0^t [µ(X(s) ∧ 0) + θ(X(s) ∨ 0)] ds ,  t ≥ 0 ,   (7.5)

where B ≡ {B(t) : t ≥ 0} is a standard Brownian motion. Equivalently, X satisfies the SDE

dX(t) = −[µβ + µ(X(t) ∧ 0) + θ(X(t) ∨ 0)] dt + √(2µ) dB(t),  t ≥ 0 .   (7.6)
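For illustration only, the limit diffusion in (7.5)-(7.6) is easy to simulate by an Euler-Maruyama discretization; the following minimal Python sketch (all parameter values in the usage line are hypothetical) is one way to generate an approximate path.

    import numpy as np

    def simulate_limit_diffusion(x0, mu, theta, beta, T=10.0, dt=1e-3, seed=0):
        # Euler-Maruyama for dX = -[mu*beta + mu*(X ∧ 0) + theta*(X ∨ 0)] dt + sqrt(2*mu) dB
        rng = np.random.default_rng(seed)
        steps = int(T / dt)
        x = np.empty(steps + 1)
        x[0] = x0
        for i in range(steps):
            drift = -(mu * beta + mu * min(x[i], 0.0) + theta * max(x[i], 0.0))
            x[i + 1] = x[i] + drift * dt + np.sqrt(2.0 * mu * dt) * rng.standard_normal()
        return x

    path = simulate_limit_diffusion(0.0, mu=1.0, theta=0.5, beta=0.8)   # hypothetical values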

Proof. In the rest of this section we very quickly summarize the proof; the argument mostly differs little from what we did before. Indeed, if we use the second martingale representation as in §§2.2 and 3.5, then there is very little difference. However, if we use the first martingale representation, as in §§2.1 and 3.4, then there is a difference, because now we want to use the optional sampling theorem for multiparameter random time changes, as in §§2.8 and 6.2 of Ethier and Kurtz (1986). That approach follows Kurtz (1980), which draws on Helms (1974). That approach has been applied in §12 of Mandelbaum and Pats (1998). To illustrate this alternate approach, we use the random-time-change approach here.

Just as in §2.1, we can construct the stochastic process Q in terms of rate-1 Poisson processes. In addition to the two Poisson processes A and S introduced before, now we have an extra rate-1 Poisson process R used to generate abandonments. Instead of (2.2), here we have the representation

Q(t) ≡ Q(0) + A(λt) − D(t) − L(t),  t ≥ 0 ,
     = Q(0) + A(λt) − S( µ ∫_0^t (Q(s) ∧ n) ds ) − R( θ ∫_0^t (Q(s) − n)+ ds ) ,   (7.7)


for t ≥ 0, where D(t) is the number of departures (service completions) in the time interval [0, t], while L(t) is the number of customers lost because of abandonment in the time interval [0, t]. Since there are only n servers, the instantaneous overall service rate at time s is µ(Q(s) ∧ n), while the instantaneous overall abandonment rate (which is only from waiting customers, not yet in service) at time s is θ(Q(s) − n)+.
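The representation (7.7) also suggests a direct way to simulate Q: between jumps the three integrands are constant, so each of the unit-rate Poisson processes A, S and R can be tracked through its own internal (time-changed) clock. The following minimal Python sketch (illustration only, with all parameter values hypothetical) implements that construction.

    import numpy as np

    def simulate_erlang_a(q0, lam, mu, theta, n, T, seed=0):
        # Q(t) = Q(0) + A(lam*t) - S(mu*int (Q ∧ n)) - R(theta*int (Q - n)^+), as in (7.7)
        rng = np.random.default_rng(seed)
        t, q = 0.0, q0
        internal = np.zeros(3)                     # internal time already consumed by A, S, R
        next_jump = rng.exponential(1.0, size=3)   # next internal jump epoch of A, S, R
        times, states = [0.0], [q0]
        while t < T:
            rates = np.array([lam, mu * min(q, n), theta * max(q - n, 0)])
            with np.errstate(divide='ignore'):
                waits = np.where(rates > 0, (next_jump - internal) / rates, np.inf)
            j = int(np.argmin(waits))
            dt = waits[j]
            t += dt
            internal += rates * dt                 # all internal clocks advance linearly
            next_jump[j] = internal[j] + rng.exponential(1.0)
            q += 1 if j == 0 else -1               # arrival, service completion, or abandonment
            times.append(t); states.append(q)
        return np.array(times), np.array(states)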

Paralleling (2.5), we have the martingale representation

Q(t) = Q(0) + M1(t) − M2(t) − M3(t) + λt − µ ∫_0^t (Q(s) ∧ n) ds − θ ∫_0^t (Q(s) − n)+ ds   (7.8)

for t ≥ 0, where

M1(t) ≡ A(λt) − λt,

M2(t) ≡ S( µ ∫_0^t (Q(s) ∧ n) ds ) − µ ∫_0^t (Q(s) ∧ n) ds ,

M3(t) ≡ R( θ ∫_0^t (Q(s) − n)+ ds ) − θ ∫_0^t (Q(s) − n)+ ds,  t ≥ 0 ,   (7.9)

and the filtration is F ≡ {Ft : t ≥ 0} defined by

Ft ≡ σ( Q(0), A(λs), S( µ ∫_0^s (Q(u) ∧ n) du ), R( θ ∫_0^s (Q(u) − n)+ du ) : 0 ≤ s ≤ t ) ,   (7.10)

for t ≥ 0, augmented by including all null sets.

We now want to justify the claims in (7.8)–(7.10). Just as before, we can apply Lemmas 3.1 and 3.3 to justify this martingale representation, but now we need to replace Lemmas 3.2 and 3.4 by corresponding lemmas involving the optional sampling theorem with multiparameter random time changes, as in §§2.8 and 6.2 of Ethier and Kurtz (1986). We now sketch the approach: We start with the three-parameter filtration

H ≡ H(t1, t2, t3) ≡ σ (Q(0), A(s1), S(s2), R(s3) : 0 ≤ s1 ≤ t1, 0 ≤ s2 ≤ t2, 0 ≤ s3 ≤ t3) (7.11)

augmented by adding all null sets. Next introduce the three nondecreasing nonnegative stochastic processes

I1(t) ≡ λt,  I2(t) ≡ µ ∫_0^t (Q(s) ∧ n) ds,  I3(t) ≡ θ ∫_0^t (Q(s) − n)+ ds,  t ≥ 0 .   (7.12)

Then observe that the vector (I1(t), I2(t), I3(t)) is an H-stopping time. (This is essentially by the same arguments as we used in Lemma 3.4.) Moreover, since A, S and R are assumed to be independent rate-1 Poisson processes, the stochastic process

M ≡ (M1, M2, M3) ≡ {(M1(s1), M2(s2), M3(s3)) : s1 ≥ 0, s2 ≥ 0, s3 ≥ 0}, where

M1(t) ≡ A(t) − t,  M2(t) ≡ S(t) − t,  M3(t) ≡ R(t) − t,  t ≥ 0,   (7.13)


is an H-multiparameter martingale. As a consequence of the optional sampling theorem, Theorem 8.7 on p. 87 of Ethier and Kurtz,

(M1 ∘ I1, M2 ∘ I2, M3 ∘ I3) ≡ {(M1(I1(t)), M2(I2(t)), M3(I3(t))) : t ≥ 0}   (7.14)

is a martingale with respect to the filtration F ≡ {Ft : t ≥ 0} in (7.10), because

H(I1(t), I2(t), I3(t)) = σ( Q(0), A(s1), S(s2), R(s3) : 0 ≤ s1 ≤ I1(t), 0 ≤ s2 ≤ I2(t), 0 ≤ s3 ≤ I3(t) )

 = σ( Q(0), A(λs), S( µ ∫_0^s (Q(u) ∧ n) du ), R( θ ∫_0^s (Q(u) − n)+ du ) : 0 ≤ s ≤ t )

 = Ft for all t ≥ 0 .   (7.15)

As in §3.4, we use the crude inequality in Lemma 3.3 to guarantee that the moment conditions are satisfied:

E[Ij(t)] < ∞ and E[Mj(t)] < ∞ for j = 2, 3 . (7.16)

Just as before, we then consider the sequence of models indexed by n. Just as in (3.24)-(3.27), we define associated scaled processes:

Mn,1(t) ≡ n^{−1/2} [A(λnt) − λnt] ,

Mn,2(t) ≡ n^{−1/2} [ S( µ ∫_0^t (Qn(s) ∧ n) ds ) − µ ∫_0^t (Qn(s) ∧ n) ds ] ,

Mn,3(t) ≡ n^{−1/2} [ R( θ ∫_0^t (Qn(s) − n)+ ds ) − θ ∫_0^t (Qn(s) − n)+ ds ] ,   (7.17)

for t ≥ 0.

We thus obtain the following analog of Theorem 3.4:

Theorem 7.2. (first martingale representation for the scaled processes in the Erlang-A model) If E[Qn(0)] < ∞, then the scaled processes have the martingale representation

Xn(t) ≡ Xn(0) + Mn,1(t) − Mn,2(t) − Mn,3(t) + (λn − µn)t/√n − ∫_0^t [µ(Xn(s) ∧ 0) + θ(Xn(s) ∨ 0)] ds,  t ≥ 0 ,   (7.18)

where Mn,i are the scaled martingales in (7.17). These processes Mn,i are square-integrable martingales with respect to the filtrations Fn ≡ {Fn,t : t ≥ 0} defined by

Fn,t ≡ σ( Qn(0), A(λns), S( µ ∫_0^s (Qn(u) ∧ n) du ), R( θ ∫_0^s (Qn(u) − n)+ du ) : 0 ≤ s ≤ t ),  t ≥ 0 ,   (7.19)

augmented by including all null sets. Their associated predictable quadratic variations are

〈Mn,1〉(t) = λnt/n,

〈Mn,2〉(t) = (µ/n) ∫_0^t (Qn(s) ∧ n) ds,

〈Mn,3〉(t) = (θ/n) ∫_0^t (Qn(s) − n)+ ds,  t ≥ 0 ,   (7.20)


where E[〈Mn,i〉(t)] < ∞ for all i, t ≥ 0 and n ≥ 1.

The representation in (7.18) satisfies the Lipschitz condition in Theorem 4.1, so that the integral representation is again a continuous mapping. In particular, here we have

Xn(t) ≡ Xn(0) + Mn,1(t) − Mn,2(t) − Mn,3(t) + (λn − µn)t/√n + ∫_0^t h(Xn(s)) ds ,  t ≥ 0 ,   (7.21)

where h : R→ R is the function

h(s) = −µ(s ∧ 0)− θ(s)+, s ∈ R , (7.22)

so that h is Lipschitz as required for Theorem 4.1:

|h(s1)− h(s2)| ≤ (µ ∨ θ)|s1 − s2| for all s1, s2 ∈ R . (7.23)
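As an aside, the integral representation (7.21)-(7.22) is easy to evaluate numerically for a given input path by a forward-Euler iteration; the minimal Python sketch below (illustration only) does exactly that on a grid of step dt, using the drift function h of (7.22). Applied to a discretization of the input path in (7.21), it produces the corresponding discretization of Xn.

    import numpy as np

    def solve_integral_representation(y, b, mu, theta, dt):
        # x(t) = b + y(t) + int_0^t h(x(s)) ds with h(s) = -mu*(s ∧ 0) - theta*(s ∨ 0)
        h = lambda s: -mu * min(s, 0.0) - theta * max(s, 0.0)
        x = np.empty_like(y)
        x[0] = b + y[0]
        integral = 0.0
        for i in range(1, len(y)):
            integral += h(x[i - 1]) * dt       # forward-Euler approximation of the integral term
            x[i] = b + y[i] + integral
        return x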

Hence the proof can proceed exactly as before. Note that we have convergence

(λn − µn)/√n → −µβ as n → ∞   (7.24)

for the deterministic term in (7.21) by virtue of the QED scaling assumption in (7.1).

The analog of Theorem 4.2 is the corresponding FCLT for three independent rate-1 Poisson processes, now including R as well as A and S. We now have three random-time-change processes: ΦA,n, ΦS,n and ΦR,n, which here take the form:

ΦA,n(t) ≡ 〈Mn,1〉(t) = λnt/n,

ΦS,n(t) ≡ 〈Mn,2〉(t) = (µ/n) ∫_0^t (Qn(s) ∧ n) ds,

ΦR,n(t) ≡ 〈Mn,3〉(t) = (θ/n) ∫_0^t (Qn(s) − n)+ ds ,   (7.25)

drawing upon (7.20). By the same line of reasoning as before, we obtain the deterministic limits

ΦA,n ⇒ µe,  ΦS,n ⇒ µe,  ΦR,n ⇒ η ,   (7.26)

where, as before, e is the identity map e(t) ≡ t, t ≥ 0, and η is the zero function η(t) = 0, t ≥ 0. In particular, we again get the sequence {Xn : n ≥ 1} stochastically bounded in D by the reasoning in §§5.2 and 6.2. Then, by Lemma 5.10 and §6.1, we get the FWLLN corresponding to Lemma 4.3. Finally, from Lemma 4.3, we can prove (7.26), just as we proved Lemma 4.2, using analogs of the continuous map in (4.19) for the random time changes in (7.25). The new functions for applications of the CMT are

h1(x)(t) ≡ µ ∫_0^t (x(s) ∧ 1) ds and h2(x)(t) ≡ θ ∫_0^t (x(s) − 1)+ ds   (7.27)

for t ≥ 0.

Paralleling (4.21), here we have

(MA,n,ΦA,n,MS,n,ΦS,n,MR,n, ΦR,n) ⇒ (B1, µe, B2, µe,B3, η) in D6 (7.28)


as n → ∞. Hence we can apply the CMT with composition just as before. Paralleling (4.22), here we obtain first

(MA,n ∘ ΦA,n, MS,n ∘ ΦS,n, MR,n ∘ ΦR,n) ⇒ (B1 ∘ µe, B2 ∘ µe, B3 ∘ η) in D3   (7.29)

as n → ∞, and then

(MA,n ∘ ΦA,n − MS,n ∘ ΦS,n − MR,n ∘ ΦR,n) ⇒ (B1 ∘ µe − B2 ∘ µe − B3 ∘ η) in D .   (7.30)

However,

B1 ∘ µe − B2 ∘ µe − B3 ∘ η  d=  B1 ∘ µe − B2 ∘ µe − η  d=  √(2µ) B .   (7.31)

Finally, the CMT with the integral representation (7.18) and Theorem 4.1 completes the proof. Note that the limiting Brownian motion associated with R does not appear, because ΦR,n is asymptotically negligible. That is why the infinitesimal variance is the same as before.

7.2. Finite Waiting Rooms

We can also obtain stochastic-process limits for the number of customers in the system in associated M/M/n/0 (Erlang-B), M/M/n/mn and M/M/n/mn + M models, which have finite waiting rooms. For the Erlang-B model, there is no waiting room at all; for the other models there is a waiting room of size mn in model n, where mn is allowed to grow with n so that

mn/√n → κ ≥ 0 as n → ∞ ,   (7.32)

as in (1.9). The QED many-server heavy-traffic limit was stated as Theorem 1.2.

The proof can be much the same as in §7.1. The idea is to introduce the finite waiting room via a reflection map, as in §§3.5, 5.2, 13.5, 14.2, 14.3 and 14.8 of Whitt (2002), corresponding to an upper barrier at κ, but the reflection map here is more complicated than for single-server queues and networks of such queues, because it is not applied to a free process. We use an extension of Theorem 4.1 constructing a mapping from D × R into D2, taking model data into the content function and the upper-barrier regulator function, which we now state without proof.

Theorem 7.3. (a continuous integral representation with reflection) Consider the modified integral representation

x(t) = min{ κ, b + y(t) + ∫_0^t h(x(s)) ds − u(t) } ,  t ≥ 0 ,   (7.33)

where h : R → R satisfies h(0) = 0 and is a Lipschitz function as in (4.4), and u is a nondecreasing nonnegative function in D such that (7.33) holds and

∫_0^∞ 1{x(t−)<κ} du(t) = 0 .   (7.34)

The modified integral representation in (7.33) has a unique solution (x, u), so that it constitutes a bona fide function (f1, f2) : D × R → D × D mapping (y, b) into x ≡ f1(y, b) and u ≡ f2(y, b). In addition, the function (f1, f2) is continuous provided that the product topology is used for product spaces and the function space D (in both the domain and range) is endowed with either: (i) the topology of uniform convergence over bounded intervals or (ii) the Skorohod J1 topology. Moreover, if y is continuous, then so are x and u.
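For intuition about the pair (x, u) in Theorem 7.3 (illustration only, not the construction used in the proofs), a discretized version can be computed by combining the forward-Euler step for the drift with the standard one-sided Skorohod recursion for an upper barrier, as in the minimal Python sketch below.

    import numpy as np

    def solve_reflected_representation(y, b, mu, theta, kappa, dt):
        # discretized (x, u): x is kept at or below kappa, u is nondecreasing and
        # increases only when x is at the barrier (the complementarity condition (7.34))
        h = lambda s: -mu * min(s, 0.0) - theta * max(s, 0.0)
        x = np.empty_like(y)
        u = np.zeros_like(y)
        u[0] = max(b + y[0] - kappa, 0.0)
        x[0] = b + y[0] - u[0]
        integral = 0.0
        for i in range(1, len(y)):
            integral += h(x[i - 1]) * dt
            free = b + y[i] + integral
            u[i] = max(u[i - 1], free - kappa)    # one-sided Skorohod recursion at the upper barrier
            x[i] = free - u[i]
        return x, u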


Now that we understand how we are treating the finite waiting rooms, the QED many-server heavy-traffic limit theorem is as stated in Theorem 1.2. This modification alters the limiting diffusion process in Theorem 7.1 only by the addition of a reflecting upper barrier at κ for the sequence of models with waiting rooms of size mn, where κ = 0 for the Erlang-B model. When κ = 0, X is a reflected OU (ROU) process. Properties of the ROU process are contained in Ward and Glynn (2003b). Proofs for the two cases κ > 0 and κ = 0 by other methods are contained in §4.5 of Whitt (2005) and Theorem 4.1 of Srikant and Whitt (1996). General references on reflection maps are Lions and Sznitman (1984) and Dupuis and Ishii (1991).

Proof of Theorem 1.2. We briefly sketch the proof. Instead of (7.7), here we have the representation

Qn(t) ≡ Qn(0) + A(λnt) − Dn(t) − Ln(t) − Un(t),  t ≥ 0 ,
      = Qn(0) + A(λnt) − S( µ ∫_0^t (Qn(s) ∧ n) ds ) − R( θ ∫_0^t (Qn(s) − n)+ ds ) − Un(t) ,   (7.35)

for t ≥ 0, where Un(t) is the number of arrivals in the time interval [0, t] when the system is full in model n, i.e., when Qn(t) = n + mn. In particular,

Un(t) ≡ ∫_0^t 1{Qn(s−)=n+mn} dA(λns),  t ≥ 0 .   (7.36)

To connect to Theorem 7.3, it is significant that Un can also be represented as the unique nondecreasing nonnegative process such that Qn(t) ≤ n + mn, (7.35) holds and

∫_0^∞ 1{Qn(t−)<n+mn} dUn(t) = 0 .   (7.37)

We now construct a martingale representation, just as in (7.8)–(7.17). The following is the natural extension of Theorem 7.2:

Theorem 7.4. (first martingale representation for the scaled processes in the M/M/n/mn +M model) If κ < ∞, then the scaled processes have the martingale representation

Xn(t) ≡ min{κ, Zn(t)}, where

Zn(t) ≡ Xn(0) + Mn,1(t) − Mn,2(t) − Mn,3(t) + (λn − µn)t/√n − ∫_0^t [µ(Xn(s) ∧ 0) + θ(Xn(s) ∨ 0)] ds − Vn(t),  t ≥ 0 ,   (7.38)

Mn,i are the scaled martingales in (7.17) and

Vn(t) ≡ Un(t)/√n ,  t ≥ 0 ,   (7.39)

for Un in (7.35)–(7.37). The scaled processes Mn,i are square-integrable martingales with respect to the filtrations Fn ≡ {Fn,t : t ≥ 0} defined by

Fn,t ≡ σ( Qn(0), A(λns), S( µ ∫_0^s (Qn(u) ∧ n) du ), R( θ ∫_0^s (Qn(u) − n)+ du ) : 0 ≤ s ≤ t ),  t ≥ 0 ,   (7.40)


augmented by including all null sets. Their associated predictable quadratic variations are

〈Mn,1〉(t) = λnt/n,

〈Mn,2〉(t) = (µ/n) ∫_0^t (Qn(s) ∧ n) ds,

〈Mn,3〉(t) = (θ/n) ∫_0^t (Qn(s) − n)+ ds,  t ≥ 0 ,   (7.41)

where E[〈Mn,i〉(t)] < ∞ for all i, t ≥ 0 and n ≥ 1.

By combining Theorems 7.3 and 7.4, we obtain the joint convergence

(Xn, Vn) ⇒ (X,U) in D2 as n →∞ , (7.42)

for Xn and Vn in (7.38) and (7.39), where the vector (X, U) is characterized by (1.12) and (1.13). That implies Theorem 1.2 stated in §1.

Remark 7.1. (correction) The argument in this section follows Whitt (2005), but provides more detail. We note that the upper-barrier regulator processes are incorrectly expressed in formulas (5.2) and (5.8) of Whitt (2005).

7.3. General Non-Markovian Arrival Processes

In this section, following §5 of Whitt (2005), we show how to extend the many-server heavy-traffic limit from M/M/n/mn + M models to G/M/n/mn + M models, where the arrival processes are allowed to be general stochastic point processes satisfying a FCLT. They could be renewal processes (GI) or even more general arrival processes. The limit of the arrival-process FCLT need not have continuous sample paths. (As noted at the end of §4.3, this separate argument is not needed if we do not use martingales.)

Let An ≡ {An(t) : t ≥ 0} be the arrival process in model n and let

Ân(t) ≡ (An(t) − λnt)/√n ,  t ≥ 0 ,   (7.43)

be the associated scaled arrival process. We assume that

Ân ⇒ Â in D as n → ∞ .   (7.44)

We also assume that, conditional on the entire arrival process, model n evolves as the Markovian queueing process with i.i.d. exponential service times and i.i.d. exponential times until abandonment.

Thus, instead of Theorem 7.4, we have

Theorem 7.5. (first martingale representation for the scaled processes in the G/M/n/mn + M model) Consider the family of G/M/n/mn + M models defined above, evolving as a Markovian queue conditional on the arrival process. If mn < ∞, then the scaled processes have the martingale representation

Xn(t) ≡ Xn(0) + Ân(t) − Mn,2(t) − Mn,3(t) + (λn − µn)t/√n − ∫_0^t [µ(Xn(s) ∧ 0) + θ(Xn(s) ∨ 0)] ds − Vn(t),  t ≥ 0 ,   (7.45)


where Ân is the scaled arrival process in (7.43), Mn,i are the scaled martingales in (7.17) and

Vn(t) ≡ Un(t)/√n ,  t ≥ 0 ,   (7.46)

for Un in (7.35) and (7.37). The scaled processes Mn,i are square-integrable martingales with respect to the filtrations Fn ≡ {Fn,t : t ≥ 0} defined by

Fn,t ≡ σ( Qn(0), {An(u) : u ≥ 0}, S( µ ∫_0^s (Qn(u) ∧ n) du ), R( θ ∫_0^s (Qn(u) − n)+ du ) : 0 ≤ s ≤ t ),  t ≥ 0 ,   (7.47)

augmented by including all null sets. The associated predictable quadratic variations of the two martingales are

〈Mn,2〉(t) = (µ/n) ∫_0^t (Qn(s) ∧ n) ds,

〈Mn,3〉(t) = (θ/n) ∫_0^t (Qn(s) − n)+ ds,  t ≥ 0 ,   (7.48)

where E[〈Mn,i〉(t)] < ∞ for all i, t ≥ 0 and n ≥ 1.

Here is the corresponding theorem for the G/M/n/mn + M model.

Theorem 7.6. (heavy-traffic limit in D for the G/M/n/mn + M model) Consider the sequence of G/M/n/mn + M models defined above, with the scaling in (7.43), (1.8) and (1.9). Let Xn be as defined in (1.3). If

Xn(0) ⇒ X(0) in R and Ân ⇒ Â in D as n → ∞ ,   (7.49)

then

Xn ⇒ X in D as n → ∞ ,   (7.50)

where the limit process X satisfies the stochastic integral equation

X(t) = X(0) + Â(t) − βµt + √µ B(t) − ∫_0^t [µ(X(s) ∧ 0) + θ(X(s) ∨ 0)] ds − U(t) ,  t ≥ 0 ,   (7.51)

with B ≡ {B(t) : t ≥ 0} being a standard Brownian motion and U being the unique nondecreasing nonnegative process in D such that (7.51) holds and

∫_0^∞ 1{X(t−)<κ} dU(t) = 0 .   (7.52)

Equivalently, X satisfies the SDE

dX(t) = dÂ(t) − [µβ + µ(X(t) ∧ 0) + θ(X(t) ∨ 0)] dt + √µ dB(t) − dU(t),  t ≥ 0 ,   (7.53)

for U in (7.52).


Proof. We start with the FCLT for the arrival process assumed in (7.44) and (7.43). We condition on possible realizations of the arrival process: For each n ≥ 1, let ζn be a possible realization of the scaled arrival process Ân and let ζ be a possible realization of the limit process Â. Let Xn^{ζn} be the scaled process Xn conditional on Ân = ζn, and let X^ζ be the limit process X conditional on Â = ζ.

Since Xn and Ân are random elements of D, these quantities Xn^{ζn} and X^ζ are well defined via regular conditional probabilities; e.g., see §8 of Chapter V, pp. 146-150, of Parthasarathy (1967). In particular, we can regard P(Xn ∈ · | Ân = z) as a probability measure on D for each z ∈ D and we can regard P(Xn ∈ B | Ân = z) as a measurable function of z in D for each Borel set B in D, where

P(Xn ∈ B) = ∫_D P(Xn ∈ B | Ân = z) dP(Ân = z) .

And similarly for the pair (X, Â).

A minor modification of the previous proof of Theorem 1.2 establishes that

Xn^{ζn} ⇒ X^ζ in D whenever ζn → ζ in D ;   (7.54)

i.e., for each continuous bounded real-valued function f on D,

E[f(Xn^{ζn})] → E[f(X^ζ)] as n → ∞   (7.55)

whenever ζn → ζ in D. Now let

hn(ζn) ≡ E[f(Xn^{ζn})] and h(ζ) ≡ E[f(X^ζ)]   (7.56)

for that function f. Since we have regular conditional probabilities, we can regard the functions hn and h as measurable functions from D to R for each such f.

We are now ready to apply the generalized continuous mapping theorem, Theorem 3.4.4 of Whitt (2002). Since hn and h are measurable functions such that hn(ζn) → h(ζ) whenever ζn → ζ, and since Ân ⇒ Â in D, we have

hn(Ân) ⇒ h(Â) as n → ∞ .   (7.57)

Since the function f used in (7.55) and (7.56) is bounded, these random variables are bounded. Hence convergence in distribution implies convergence of moments. Hence we have, for that function f,

E[f(Xn)] = E[hn(Ân)] → E[h(Â)] = E[f(X)] as n → ∞ .   (7.58)

Since the convergence in (7.58) holds for all continuous bounded real-valued functions f on D, we have shown that Xn ⇒ X, as claimed.

8. The Martingale FCLT

We now turn to the martingale FCLT. For our queueing stochastic-process limits, it is of interest because it provides one way to prove the FCLT for a Poisson process in Theorem 4.2 and because we can base our entire proof of Theorem 1.1 on the martingale FCLT. However, the gain in the proof of Theorem 1.1 is not so great, because we still need to establish the same technical supporting lemmas in §6. Thus we primarily focus on the martingale FCLT for its own sake. Its proof may provide guidance for treating more complicated queueing models.


8.1. The Statement

We now state a version of the martingale FCLT for a sequence of local martingales {Mn : n ≥ 1} in Dk, based on Theorem 7.1 on p. 339 of Ethier and Kurtz (1986), hereafter referred to as EK. We shall also make frequent reference to Jacod and Shiryayev (1987), hereafter referred to as JS. See Section VIII.3 of JS for related results; see other sections of JS for generalizations.

We will state a special case of Theorem 7.1 of Ethier and Kurtz (1986) in which the limit process is multi-dimensional Brownian motion. However, the framework always produces limits with continuous sample paths and independent Gaussian increments. Most applications involve convergence to Brownian motion. Other situations are covered by JS, from which we see that proving convergence to discontinuous processes evidently is more complicated.

The key part of each condition below is the convergence of the quadratic covariation processes. Condition (i) involves the optional quadratic-covariation (square-bracket) processes [Mn,i, Mn,j], while condition (ii) involves the predictable quadratic-covariation (angle-bracket) processes 〈Mn,i, Mn,j〉. Recall from §3.2 that the square-bracket process is more general, being well defined for any local martingale (and thus any martingale), whereas the associated angle-bracket process is well defined only for any locally square-integrable martingale (and thus any square-integrable martingale).

Thus the key conditions below are the assumed convergence of the quadratic-variation processes in conditions (8.2) and (8.5). The other conditions (8.1), (8.3) and (8.4) are technical regularity conditions. There is some variation in the literature concerning the extra technical regularity conditions; e.g., see Rebolledo (1980) and JS. Recall that, for a function x in D, J(x, T) is the absolute value of the maximum jump in x over the interval [0, T]; see (5.9).

Theorem 8.1. (multidimensional martingale FCLT) For n ≥ 1, let Mn ≡ (Mn,1, . . . , Mn,k) be a local martingale in Dk with respect to a filtration Fn ≡ {Fn,t : t ≥ 0} satisfying Mn(0) = (0, . . . , 0). Let C ≡ (ci,j) be a k × k covariance matrix, i.e., a nonnegative-definite symmetric matrix of real numbers.

Assume that one of the following two conditions holds:

(i) The expected value of the maximum jump in Mn is asymptotically negligible; i.e., for each T > 0,

lim_{n→∞} E[J(Mn, T)] = 0   (8.1)

and, for each pair (i, j) with 1 ≤ i ≤ k and 1 ≤ j ≤ k, and each t > 0,

[Mn,i,Mn,j ] (t) ⇒ ci,jt in R as n →∞ . (8.2)

(ii) In addition, the local martingale Mn is locally square-integrable, so that the predictable quadratic-covariation processes 〈Mn,i, Mn,j〉 can be defined. The expected value of the maximum jump in 〈Mn,i, Mn,j〉 and the maximum squared jump of Mn are asymptotically negligible; i.e., for each T > 0 and (i, j) with 1 ≤ i ≤ k and 1 ≤ j ≤ k,

lim_{n→∞} E[J(〈Mn,i, Mn,j〉, T)] = 0 ,   (8.3)

lim_{n→∞} E[J(Mn, T)²] = 0 ,   (8.4)

and

〈Mn,i, Mn,j〉(t) ⇒ ci,jt in R as n → ∞   (8.5)

for each t > 0 and for each (i, j).


Conclusion:

If indeed one of the conditions (i) or (ii) above holds, then

Mn ⇒ M in Dk as n →∞ , (8.6)

where M is a k-dimensional (0, C)-Brownian motion, having mean vector and covariance matrix

E[M(t)] = (0, . . . , 0) and E[M(t)M(t)^tr] = Ct,  t ≥ 0 ,   (8.7)

where, for a matrix A, A^tr is the transpose.

Of course, a common simple case arises when C is a diagonal matrix; then the k component marginal one-dimensional Brownian motions are independent. When C = I, the identity matrix, M is a standard k-dimensional Brownian motion, with independent one-dimensional standard Brownian motions as marginals.
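As a minimal numerical illustration of Theorem 8.1 in the simplest one-dimensional case (not needed for the theory), take Mn(t) = (A(nt) − nt)/√n for a rate-1 Poisson process A; then 〈Mn〉(t) = t and [Mn](t) = A(nt)/n ⇒ t, and the Python sketch below checks that the terminal value behaves like a N(0, t) variable as n grows.

    import numpy as np

    rng = np.random.default_rng(1)
    t = 1.0
    for n in [10, 100, 1000, 10000]:
        # M_n(t) = (A(nt) - nt)/sqrt(n), with A(nt) ~ Poisson(nt)
        m_n = (rng.poisson(n * t, size=5000) - n * t) / np.sqrt(n)
        print(n, round(float(m_n.mean()), 3), round(float(m_n.var()), 3))   # ~ 0 and ~ t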

At a high level, Theorem 8.1 says that, under regularity conditions, convergence of martingales in D is implied by convergence of the associated quadratic covariation processes. At first glance, the result seems even stronger, because we need convergence of only the one-dimensional quadratic covariation processes for a single time argument. However, that is misleading, because the stronger weak convergence of these quadratic covariation processes in Dk² is actually equivalent to the weaker required convergence in R for each t, i, j in conditions (8.2) and (8.5), as we will show below.

To state the result, let [[Mn]] be the matrix-valued random element of Dk² with (i, j)th component [Mn,i, Mn,j]; and let 〈〈Mn〉〉 be the matrix-valued random element of Dk² with (i, j)th component 〈Mn,i, Mn,j〉. Let e be the identity map, so that Ce is the matrix-valued deterministic function in Dk² with elements {ci,jt : t ≥ 0}.

Lemma 8.1. (modes of convergence for quadratic covariation processes) Let [[Mn]] and 〈〈Mn〉〉 be the matrix-valued quadratic-covariation processes defined above; let Ce be the matrix-valued deterministic limit defined above. Then condition (8.2) is equivalent to

[[Mn]] ⇒ Ce in Dk² ,   (8.8)

while condition (8.5) is equivalent to

〈〈Mn〉〉 ⇒ Ce in Dk² .   (8.9)

Lemma 8.1 is important, not only for general understanding, but because it plays an important role in the proof of Theorem 8.1. (See the discussion after (8.12).)

Proof of Lemma 8.1. We exploit the fact that the ordinary quadratic variation processes are nondecreasing, but we need to do more to treat the quadratic covariation processes when i ≠ j. Since [Mn,i, Mn,i] and 〈Mn,i, Mn,i〉 for each i are nondecreasing and {ci,it : t ≥ 0} is continuous, we can apply §VI.2.b and Theorem VI.3.37 of JS to get convergence of these one-dimensional quadratic-variation processes in D for each i from the corresponding limits in R for each t. Before applying Theorem VI.3.37 of JS, we note that we have convergence of the finite-dimensional distributions, because we can apply Theorem 11.4.5 of Whitt (2002). We then use the representations

2[Mn,i, Mn,j] = [Mn,i + Mn,j, Mn,i + Mn,j] − [Mn,i, Mn,i] − [Mn,j, Mn,j] ,

2〈Mn,i, Mn,j〉 = 〈Mn,i + Mn,j, Mn,i + Mn,j〉 − 〈Mn,i, Mn,i〉 − 〈Mn,j, Mn,j〉 ,   (8.10)


e.g., see §1.8 of Liptser and Shiryaev (1986). First, we obtain the limits

[Mn,i + Mn,j ,Mn,i + Mn,j ](t) ⇒ 2ci,jt + ci,it + cj,jt in R

and

〈Mn,i + Mn,j, Mn,i + Mn,j〉(t) ⇒ 2ci,jt + ci,it + cj,jt in R

for each t and i ≠ j from conditions (8.2) and (8.5). The limits in (8.2) and (8.5) for the components extend to vectors and then we can apply the continuous mapping theorem with addition. Since [Mn,i + Mn,j, Mn,i + Mn,j] and 〈Mn,i + Mn,j, Mn,i + Mn,j〉 are both nondecreasing processes, we can repeat the argument above for [Mn,i, Mn,i] and 〈Mn,i, Mn,i〉 to get

[Mn,i + Mn,j, Mn,i + Mn,j] ⇒ (2ci,j + ci,i + cj,j)e in D

and

〈Mn,i + Mn,j, Mn,i + Mn,j〉 ⇒ (2ci,j + ci,i + cj,j)e in D .

We then get the corresponding limits in D3 for the vector processes, and apply the continuous mapping theorem with addition again to get

[Mn,i, Mn,j] ⇒ ci,je and 〈Mn,i, Mn,j〉 ⇒ ci,je in D

for all i and j. Since these limits extend to vectors, we finally have derived the claimed limits in (8.8) and (8.9).

8.2. Proof of the Martingale FCLT

Outline of the Proof. The broad outline of the proof of Theorem 8.1 is standard. As in both EK and JS, the proof is an application of Corollary 5.3: We first show that the sequence {Mn : n ≥ 1} is tight in Dk, which implies relative compactness by Theorem 5.1. Having established relative compactness, we show that the limit of any convergent subsequence must be k-dimensional Brownian motion with covariance matrix C. That is, there is a tightness step and there is a characterization step. Both EK and JS, in their introductory remarks, emphasize the importance of the characterization step.

Before going into details, we indicate where the key results are in JS. Case (i) in Theorem 8.1 is covered by Theorem VIII.3.12 on p. 432 of JS. Condition 3.14 on top of p. 433 in JS is implied by condition (8.1). Condition (b.ii) there in JS is condition (8.2). The counterexample in Remark 3.19 on p. 434 of JS shows that weakening condition 3.14 there can cause problems.

Case (ii) in Theorem 8.1 is covered by Theorem VIII.3.22 on p. 435 of JS. Condition 3.23 on p. 435 of JS is implied by condition (8.4). There ν is the predictable random measure, which is part of the characteristics of a semimartingale, as defined in §II.2 of JS, and ∗ denotes the operator in 1.5 on p. 66 of JS constructing the associated integral process. (We will not use the notions ν and ∗ here.)

8.2.1. Quick Proofs of C-Tightness

We can give very quick proofs of C-tightness in both cases, exploiting Theorems 5.6 and 5.7. In case (i) we need to add an extra assumption. We need to assume that the jumps are uniformly bounded and that the bound is asymptotically negligible. In particular, we need to go beyond condition (8.1) and assume that, for all T > 0, there exists n0 such that

J(Mn,i, T ) ≤ bn for all i and n ≥ n0 , (8.11)


where J is the maximum-jump function in (5.9) and bn are real numbers such that bn → 0 as n → ∞, as in Theorem 5.7. But note that this extra condition is always satisfied for our queueing applications, because the jumps are at most of size 1 before scaling, so that they become at most of size 1/√n after scaling.

Proof of C-Tightness in Case (i) with bounded jumps. Suppose that the jumps of the martingale are uniformly bounded and that the bound is asymptotically negligible, as in (8.11). We thus can apply Theorem 5.7. By condition (8.2) and Lemma 8.1, we have convergence of the optional quadratic-variation (square-bracket) processes: [Mn,i, Mn,j] ⇒ ci,je in D for each i and j. Thus these sequences of processes are necessarily C-tight. Hence Theorem 5.7 implies that the sequence of martingales {Mn,i : n ≥ 1} is C-tight in D for each i. Lemma 5.2 then implies that {Mn : n ≥ 1} is C-tight in Dk.

Proof of Tightness in Case (ii). We need no extra condition in case (ii). We can apply Theorem 5.6. Condition (8.5) and Lemma 8.1 imply that there is convergence of the predictable quadratic-variation (angle-bracket) processes: 〈Mn,i, Mn,j〉 ⇒ ci,je in D for each i and j. Thus these sequences of processes are necessarily C-tight, so condition (ii) of Theorem 5.6 is satisfied. Condition (i) is satisfied too, since Mn,i(0) = 0 for all i by assumption.

8.2.2. The Ethier-Kurtz Proof of Tightness in Case (i).

We now give a longer proof for case (i) without making the extra assumption in (8.11). We follow the proof on pp. 341-343 of EK, elaborating on several points.

First, if the stochastic processes Mn are local martingales rather than martingales, then we introduce the available stopping times τn with τn ↑ ∞ w.p.1 so that {Mn(τn ∧ t) : t ≥ 0} are martingales for each n. Hence we simply assume that the processes Mn are martingales.

We then exploit the assumed limit [Mn,i, Mn,j](t) ⇒ ci,jt in R as n → ∞ in (8.2) in order to act as if [Mn,i, Mn,j](t) is bounded. In particular, as in (1.23) on p. 341 of EK, we introduce the stopping times

ηn ≡ inf{ t ≥ 0 : [Mn,i, Mn,i](t) > ci,it + 1 for some i } .   (8.12)

As asserted in EK, by (8.2), ηn ⇒ ∞, but we provide additional detail about the supporting argument: If we only had the convergence in distribution for each t, it would not follow that ηn ⇒ ∞. For this step, it is critical that we can strengthen the mode of convergence to convergence in distribution in D, where the topology on the underlying space corresponds to uniform convergence over bounded intervals. To do so, we use the fact that the quadratic variation processes [Mn,i, Mn,i] are monotone and the limit is continuous. In particular, we apply Lemma 8.1. But, indeed, ηn ⇒ ∞ as claimed.

Hence, it suffices to focus on the stopped martingales {Mn(ηn ∧ t) : t ≥ 0}, which, with a slight abuse of notation, we again denote by Mn. We then reduce the k initial dimensions to 1 by considering the martingales

Yn ≡ ∑_{i=1}^k θi Mn,i   (8.13)

for an arbitrary non-null vector θ ≡ (θ1, . . . , θk). The associated optional quadratic variation process is

An,θ(t) ≡ [Yn](t) = ∑_{i=1}^k ∑_{j=1}^k θiθj [Mn,i, Mn,j](t),  t ≥ 0 .   (8.14)


From condition (8.2) and Lemma 8.1, An,θ ⇒ cθe in D, where e is the identity function and

cθ ≡ ∑_{i=1}^k ∑_{j=1}^k θiθj ci,j .   (8.15)

Some additional commentary may help here. The topology on the space D([0,∞),Rk), where the functions take values in Rk, is strictly stronger than the topology on the product space D([0,∞),R)k, but there is no difference on the subset of continuous functions, and so this issue really plays no role here. We can apply Lemma 5.12 and condition (8.1) to conclude that a limit of any convergent subsequence must have continuous paths. In their use of Yn defined in (8.13), EK work to establish the stronger tightness in D([0,∞),Rk); sufficiency is covered by problem 3.22 on p. 153 of EK. On the other hand, for the product topology it would suffice to apply Lemma 5.2, but the form of Yn in (8.13) covers that as well. The form of Yn in (8.13) is also convenient for characterizing the limiting process, as we will see.

The idea now is to establish the tightness of the sequence of martingales {Yn : n ≥ 1} in (8.13). The approach in EK is to introduce smooth bounded functions f : R → R and consider the associated sequence {f(Yn(t)) : n ≥ 1}. This step underlies the entire EK book, and so is to be expected. To quickly see the fundamental role of this step in EK, see Chapters 1 and 4 in EK, e.g., the definition of a full generator in (1.5.5) on p. 24 and the associated martingale property in Proposition 4.1.7 on p. 162. This step is closely related to Itô's formula, as we explain in §8.2.5 below. It is naturally associated with the Markov-process-centric approach in EK and Stroock and Varadhan (1979), as opposed to the martingale-centric approach in Aldous (1978, 1989), Rebolledo (1980) and JS.

Consistent with that general strategy, Ethier and Kurtz exploit tightness criteria involving such smooth bounded functions; e.g., see Theorem 5.3 and Lemma 5.13 here. Having introduced those smooth bounded functions, we establish tightness by applying Lemma 5.13 (ii.a). We will exploit the bounds provided by (8.12). We let the filtrations Fn ≡ {Fn,t : t ≥ 0} be the internal filtrations of Yn. First, the submartingale-maximum inequality in Lemma 5.7 will be used to establish the required stochastic boundedness; see (1.34) on p. 343 of EK. It will then remain to establish the moment bounds in (5.16).

To carry out both these steps, we fix a function f in C∞c , the set of infinitely differentiable functions f : R → R with compact support (so that the function f and all derivatives are bounded). The strategy is to apply Taylor's theorem to establish the desired bounds. With that in mind, as in (1.27) on p. 341 of EK, we write

E[f(Yn(t + s)) − f(Yn(t)) | Fn,t] = E[ ∑_{i=0}^{m−1} ( f(Yn(ti+1)) − f(Yn(ti)) − f′(Yn(ti))ξn,i ) | Fn,t ] ,   (8.16)

where 0 ≤ t = t0 < t1 < · · · < tm = t + s and ξn,i = Yn(ti+1) − Yn(ti). Formula (8.16) is justified because E[ξn,i | Fn,t] = 0 for all i since Yn is an Fn-martingale. Also, there is cancellation in the first two terms in the sum.

Following (1.28)–(1.30) in EK, we write

γn ≡ max{ j : tj < ηn ∧ (t + s) }   (8.17)

and

ζn ≡ max{ j : tj < ηn ∧ (t + s), ∑_{i=0}^{j} ξn,i² ≤ k ∑_{i=1}^{k} ci,iθi²(t + s) + 2k ∑_{i=1}^{k} θi² } .   (8.18)


By the definition of ηn, we will have γn = ζn when there are sufficiently many points, so that the largest interval ti+1 − ti is suitably small, and n is sufficiently large. Some additional explanation might help here: To get the upper bound in (8.18), we exploit (3.7) and the simple relation (a ± b)² ≥ 0, implying that |2ab| ≤ a² + b² for real numbers a and b. Thus

2|[Mn,i, Mn,j](t)θiθj| ≤ [Mn,i, Mn,i](t)θi² + [Mn,j, Mn,j](t)θj²   (8.19)

for each i and j, so that

|An,θ(t)| = | ∑_{i=1}^k ∑_{j=1}^k θiθj [Mn,i, Mn,j](t) | ≤ k ∑_{i=1}^k [Mn,i, Mn,i](t)θi² .   (8.20)

In turn, by (8.12), inequality (8.20) implies that

|An,θ(t)| ≤ k ∑_{i=1}^k (ci,it + 1)θi² = k ∑_{i=1}^k ci,itθi² + k ∑_{i=1}^k θi² .   (8.21)

In definition (8.18) we increase the target by replacing the constant 1 by 2 in the last term, so that eventually we will have ζn = γn, as claimed.

As in (1.30) on p. 341 of EK, we next extend the expression (8.16) to include second derivative terms, simply by adding and subtracting. In particular, we have

E[f(Yn(t + s)) − f(Yn(t)) | Fn,t]

 = E[ ∑_{i=ζn}^{γn} ( f(Yn(ti+1)) − f(Yn(ti)) − f′(Yn(ti))ξn,i ) | Fn,t ]

 + E[ ∑_{i=0}^{ζn−1} ( f(Yn(ti+1)) − f(Yn(ti)) − f′(Yn(ti))ξn,i − (1/2)f″(Yn(ti))ξn,i² ) | Fn,t ]

 + E[ ∑_{i=0}^{ζn−1} (1/2)f″(Yn(ti))ξn,i² | Fn,t ] .   (8.22)

Following p. 342 of EK, we set ∆Yn(u) ≡ Yn(u) − Yn(u−) and introduce more time points so that the largest difference ti+1 − ti converges to 0. In the limit as we add more time points, we obtain the representation

E [f(Yn(t + s))− f(Yn(t)) | Fn,t] = Wn,1(t, t + s) + Wn,2(t, t + s) + Wn,3(t, t + s) , (8.23)

where

Wn,1(t, t + s) ≡ E[ f(Yn((t + s) ∧ ηn)) − f(Yn((t + s) ∧ ηn−)) − f′(Yn((t + s) ∧ ηn−))∆Yn((t + s) ∧ ηn) | Fn,t ] ,

Wn,2(t, t + s) ≡ E[ ∑_{t∧ηn<u<(t+s)∧ηn} ( f(Yn(u)) − f(Yn(u−)) − f′(Yn(u−))∆Yn(u) − (1/2)f″(Yn(u−))(∆Yn(u))² ) | Fn,t ] ,

Wn,3(t, t + s) ≡ E[ ∫_{t∧ηn}^{(t+s)∧ηn} (1/2)f″(Yn(u−)) dAn,θ(u) | Fn,t ] .   (8.24)


(We remark that we have inserted t ∧ ηn in two places in (1.31) on p. 342 of EK. We can write the sum for Wn,2 that way because Yn has at most countably many discontinuities.)

Now we can bound the three terms in (8.23) and (8.24). As in (1.32) on p. 342 of EK, we can apply Taylor's theorem in the form

f(b) = f(a) + f^(1)(a)(b − a) + · · · + f^(k−1)(a)(b − a)^{k−1}/(k − 1)! + f^(k)(c)(b − a)^k/k!   (8.25)

for some point c with a < c < b for k ≥ 1 (using modified derivative notation). Applying this to the second conditional-expectation term in (8.23) and the definition of ηn in (8.12), we get

Wn,2(t, t + s) ≤ (‖f ′′′‖/6) E[ sup_{t<u<t+s} |∆Yn(u)| k ∑_{i=1}^k (ci,i(t + s) + 1)θi² | Fn,t ] .   (8.26)

Reasoning the same way for the other two terms, overall we get a minor variation of the bound in (1.33) on p. 342 of EK, namely,

E[f(Yn(t + s)) − f(Yn(t)) | Fn,t]

 ≤ Cf E[ sup_{t<u≤t+s} |∆Yn(u)| ( 1 + k ∑_{i=1}^k (ci,i(t + s) + 1)θi² ) + An,θ((t + s) ∧ ηn−) − An,θ(t ∧ ηn−) | Fn,t ] ,   (8.27)

where the constant Cf depends only on the norms of the first three derivatives: ‖f ′‖, ‖f ′′‖ and ‖f ′′′‖.

We now apply the inequality in (8.27) to construct the required bounds. First, for the stochastic boundedness needed in condition (i) of Lemma 5.13, we let the function f have the additional properties in Lemma 5.7; i.e., we assume that f is an even nonnegative convex function with f ′(t) > 0 for t > 0. Then there exists a constant K such that

E[f(Yn(T))] ≤ Cf E[ (1 + J(Yn, T)) ( 1 + k ∑_{i=1}^k (ci,iT + 1)θi² ) ] < K   (8.28)

for all n by virtue of (8.1), again using J in (5.9). (See (1.34) on p. 343 of EK, where t on the right should be T.)

Next, we can let the random variables Zn(δ, f, T ) needed in condition (ii.a) of Lemma 5.13be defined, as in (1.36) on p. 343 of EK, by

Zn(δ, f, T ) ≡ Cf

[J(Yn, T + δ)

(1 + k

k∑

i=1

(ci,i(t + s) + 1)θ2i

)

+ sup0≤t≤T

An,θ((t + δ) ∧ ηn−)−An,θ(t ∧ ηn−)]

, (8.29)

where Cf is a new constant depending on the function f . Then, by (8.1) and (8.2),

lim_{δ↓0} lim sup_{n→∞} E[Zn(δ, f, T)] = 0 ,    (8.30)

as required in condition (ii.a) of Lemma 5.13. That completes the proof of tightness in case (i).


8.2.3. The Ethier-Kurtz Proof of Tightness in Case (ii).

We already gave a complete proof of tightness for case (ii) in §8.2.1. We now give an alternative proof, following EK. The proof starts out the same as the proof for case (i). First, if the stochastic processes Mn,i^2 − 〈Mn,i, Mn,i〉 are local martingales rather than martingales, then we introduce the available stopping times τn with τn ↑ ∞ w.p.1 so that {Mn,i^2(τn ∧ t) − 〈Mn,i, Mn,i〉(τn ∧ t) : t ≥ 0} are martingales for each n and i. Hence, we can assume that Mn,i^2 − 〈Mn,i, Mn,i〉 are martingales.

We then exploit the assumed limit 〈Mn,i, Mn,j〉(t) ⇒ ci,j t in R as n → ∞ in (8.5) in order to act as if 〈Mn,i, Mn,j〉(t) is bounded. In particular, as in (8.12) and (1.23) on p. 341 of EK, we introduce the additional stopping times

ηn ≡ inf{t ≥ 0 : 〈Mn,i, Mn,i〉(t) > ci,i t + 1 for some i} .    (8.31)

We then define

M̃n(t) ≡ Mn(ηn ∧ t), t ≥ 0 .    (8.32)

Again we apply Lemma 8.1 to deduce that we have convergence 〈Mn,i, Mn,i〉 ⇒ ci,i e in D for each i, which implies that ηn ⇒ ∞ as n → ∞.
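To see the localization device in a concrete special case, the following minimal Python sketch (an illustration, not part of the paper; the horizon, sample sizes and ci,i = 1 are arbitrary choices) takes the scaled Poisson martingale Mn(t) = (A(nt) − nt)/√n for a rate-1 Poisson process A, whose optional quadratic variation is A(nt)/n, and computes the case-(i) analogue of the stopping time in (8.12)/(8.31). The fraction of paths with no exceedance of t + 1 on the horizon grows with n, illustrating ηn ⇒ ∞.

import numpy as np

rng = np.random.default_rng(0)

def eta_n(n, c=1.0, horizon=50.0):
    # First time the scaled optional quadratic variation A(nt)/n of the scaled
    # Poisson martingale M_n(t) = (A(nt) - nt)/sqrt(n) exceeds c*t + 1,
    # i.e. the case-(i) analogue of the localization time in (8.12)/(8.31).
    num = rng.poisson(n * horizon)                        # number of jumps of A(n.) on [0, horizon]
    times = np.sort(rng.uniform(0.0, horizon, size=num))  # jump epochs
    qv = np.arange(1, num + 1) / n                        # A(nt)/n evaluated at the jump epochs
    exceed = qv > c * times + 1.0                         # exceedance can only occur at a jump epoch
    return times[exceed][0] if exceed.any() else np.inf   # inf means: no exceedance on [0, horizon]

for n in [1, 10, 100]:
    samples = [eta_n(n) for _ in range(200)]
    frac_no_exceed = np.mean([s == np.inf for s in samples])
    print(f"n={n:4d}  fraction of paths with eta_n beyond the horizon: {frac_no_exceed:.2f}")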

Closely paralleling EK, we simplify notation by writing

An,i,j(t) ≡ 〈Mn,i, Mn,j〉(t) and Ãn,i,j(t) ≡ An,i,j(t ∧ ηn), t ≥ 0 .    (8.33)

Then

Ãn,i,i(t) ≤ 1 + ci,i t + J(An,i,i, t), t ≥ 0 ,    (8.34)

for J in (5.9), where

E[J(An,i,i, t)] → 0 as n →∞ for all t > 0 , (8.35)

by condition (8.3). Since M̃n,i^2 − Ãn,i,i is a martingale, so is ∑_{i=1}^{k} (M̃n,i^2 − Ãn,i,i). Hence, for

0 ≤ s < t,

∑_{i=1}^{k} E[ M̃n,i(t+s)^2 − M̃n,i(t)^2 | Fn,t ] = ∑_{i=1}^{k} E[ Ãn,i,i(t+s) − Ãn,i,i(t) | Fn,t ] .    (8.36)

On the other hand,

∑_{i=1}^{k} E[ (M̃n,i(t+s) − M̃n,i(t))^2 | Fn,t ]

= ∑_{i=1}^{k} E[ (M̃n,i(t+s)^2 − 2 M̃n,i(t+s) M̃n,i(t) + M̃n,i(t)^2) | Fn,t ]

= ∑_{i=1}^{k} E[ (M̃n,i(t+s)^2 − M̃n,i(t)^2) | Fn,t ] .    (8.37)

Hence,

E[ ∑_{i=1}^{k} (M̃n,i(t+s) − M̃n,i(t))^2 | Fn,t ] = E[ ∑_{i=1}^{k} (Ãn,i,i(t+s) − Ãn,i,i(t)) | Fn,t ] ,    (8.38)

as in (1.41) on p. 344 of EK.


Now we can verify condition (ii.b) in Lemma 5.13: We can define the random variables Zn(δ, T) by

Zn(δ, T) ≡ sup_{0≤t≤T} ∑_{i=1}^{k} (Ãn,i,i(t+δ) − Ãn,i,i(t)) .    (8.39)

By condition (8.5) and Lemma 8.1,

Zn(δ, T) ⇒ sup_{0≤t≤T} ∑_{i=1}^{k} (ci,i(t+δ) − ci,i t) = δ ∑_{i=1}^{k} ci,i as n → ∞ .    (8.40)

We then have uniform integrability because, by (8.34) and (8.39),

Zn(δ, T) ≤ ∑_{i=1}^{k} (1 + ci,i(T+δ) + J(An,i,i, T)) ,    (8.41)

where J is the maximum jump function in (5.9) and E [J(An,i,i, T )] → 0 as n →∞. Hence,

lim_{δ↓0} lim sup_{n→∞} E[Zn(δ, T)] = 0 .    (8.42)

Finally, we must verify condition (i) of Lemma 5.13; i.e., we must show that {M̃n,i : n ≥ 1} is stochastically bounded. As for case (i), we can apply Lemma 5.7 for this purpose, here using the function f(x) ≡ x^2. Modifying (8.38), we have, for all T > 0, the existence of a constant K such that

E[ ∑_{i=1}^{k} M̃n,i(T)^2 ] = E[ ∑_{i=1}^{k} Ãn,i,i(T) ] ≤ K    (8.43)

for all sufficiently large n. That completes the proof.

8.2.4. Characterization of the Limit with Uniformly Bounded Jumps.

Since the sequences of martingales {Yn : n ≥ 1} and {M̃n : n ≥ 1} are tight, they are relatively compact by Prohorov's theorem (Theorem 5.1). We consider a converging subsequence M̃nk ⇒ L. It remains to show that L necessarily is the claimed k-dimensional (0, C)-Brownian motion M. By Corollary 5.3, that will imply that M̃n ⇒ M. Since ηn ⇒ ∞, it then will follow that Mn ⇒ M as well. But in this section we will be working with the martingales M̃nk, for which convergence has been established by the previous tightness proofs.

For this characterization step, we first present a proof based upon JS, which requires that the jumps be uniformly bounded as an extra condition. We consider the proof in EK afterwards, in the next section, §8.2.5.

It seems useful to state the desired characterization result as a theorem. Conditions (i) and (ii) below cover the two cases of Theorem 8.1 under the extra condition (8.44).

Theorem 8.2. (characterization of the limit) Suppose that Mn ⇒ M in Dk, where Mn ≡ (Mn,1, . . . , Mn,k) is a k-dimensional local martingale adapted to the filtration Fn ≡ {Fn,t : t ≥ 0} for each n with bounded jumps, i.e., for all T > 0, there exist n0 and K such that

J(Mn,i, T ) ≤ K for all n ≥ n0 , (8.44)

where J is the maximum-jump function in (5.9). Suppose that M ≡ (M1, . . . , Mk) is a continuous k-dimensional process with M(0) = (0, . . . , 0). In addition, suppose that either

(i) [Mn,i,Mn,j ](t) ⇒ ci,jt as n →∞ in R (8.45)


for all t > 0, i and j, or (ii) Mn is locally square integrable and

〈Mn,i,Mn,j〉(t) ⇒ ci,jt as n →∞ in R (8.46)

for all t > 0, i and j. Then M is a k-dimensional (0, C)-Brownian motion, having time-dependent mean vector and covariance matrix

E[M(t)] = (0, . . . , 0) and E[M(t) M(t)^tr] = Ct, t ≥ 0 .    (8.47)

Theorem 8.2 follows directly from several other basic results. The first is the classical Lévy martingale characterization of Brownian motion; e.g., see p. 156 of Karatzas and Shreve (1988) or p. 102 of JS. Itô's formula provides an elegant proof. (Theorem 7.1.2 on p. 339 of EK is a form that exploits Itô's formula in the statement.) Recall that for a continuous local martingale both the angle-bracket and square-bracket processes are well defined and equal.

Theorem 8.3. (Lévy martingale characterization of Brownian motion) Let M ≡ (M1, . . . , Mk) be a continuous k-dimensional process adapted to a filtration F ≡ {Ft : t ≥ 0} with Mi(0) = 0 for each i. Suppose that each one-dimensional marginal process Mi is a continuous local-F-martingale. If either the optional covariation processes satisfy

[Mi, Mj](t) = ci,j t, t ≥ 0 ,    (8.48)

for each i and j, or if the Mi are locally square-integrable martingales, so that the predictable quadratic covariation processes are well defined, and they satisfy

〈Mi,Mj〉(t) = ci,jt, t ≥ 0 , (8.49)

for each i and j, where C ≡ (ci,j) is a nonnegative-definite symmetric real matrix, then M is a k-dimensional (0, C)-Brownian motion, i.e.,

E[M(t)] = (0, . . . , 0) and E[M(t) M(t)^tr] = Ct, t ≥ 0 .    (8.50)
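As a small numerical illustration of the conclusion (8.50) (an aside, not part of the paper; the matrix C below is an arbitrary hypothetical choice), the following Python sketch builds a (0, C)-Brownian motion at a fixed time t as M(t) = σB(t) with σσ^tr = C and checks that the empirical mean vector and covariance matrix are approximately 0 and Ct.

import numpy as np

rng = np.random.default_rng(1)

# Hypothetical nonnegative-definite symmetric matrix C (not from the paper).
C = np.array([[2.0, 0.5],
              [0.5, 1.0]])
sigma = np.linalg.cholesky(C)            # sigma @ sigma.T = C

t, paths = 3.0, 100_000
B_t = rng.normal(0.0, np.sqrt(t), size=(paths, 2))   # standard 2-dim Brownian motion at time t
M_t = B_t @ sigma.T                                  # M(t) = sigma B(t) is a (0, C)-BM at time t

print("empirical E[M(t)]        :", M_t.mean(axis=0))
print("empirical E[M(t) M(t)^tr]:\n", (M_t.T @ M_t) / paths)
print("target C t               :\n", C * t)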

We now present two basic preservation results. The first preservation result states that the limit of any convergent sequence of martingales must itself be a martingale, under regularity conditions. The following is Problem 7 on p. 362 of EK. The essential idea is conveyed by the bounded case, which is treated in detail in Proposition IX.1.1 on p. 481 of JS. JS then go on to show that the boundedness can be replaced by uniform integrability (UI) and then a bounded-jump condition. (Note that this is UI of the sequence {Mn(t) : n ≥ 1} over n for fixed t, as opposed to the customary UI over t for fixed n.)

Theorem 8.4. (preservation of the martingale property under uniform integrability) Suppose that (i) Xn is a random element of Dk and Mn is a random element of D for each n ≥ 1, (ii) Mn is a martingale with respect to the filtration generated by (Xn, Mn) for each n ≥ 1, and (iii) (Xn, Mn) ⇒ (X, M) in Dk+1 as n → ∞. If, in addition, {Mn(t) : n ≥ 1} is uniformly integrable for each t > 0, then M is a martingale with respect to the filtration generated by (X, M) (and thus also with respect to the filtration generated by M).

Proof. Let {Fn,t : t ≥ 0} and {Ft : t ≥ 0} denote the filtrations generated by (Xn, Mn) and (X, M), respectively, on their underlying probability spaces. Let the probability space for the limit (X, M) be (Ω, F, P). It is difficult to relate these filtrations directly, so we do so indirectly. This involves a rather tricky measurability argument. We accomplish this goal by considering the stochastic processes (X, M) and (Xn, Mn) as maps from the underlying probability spaces


to the function space Dk+1 with its usual sigma-field generated by the coordinate projections, i.e., with the filtration D^{k+1} ≡ {D^{k+1}_t : t ≥ 0}. As discussed in §VI.1.1 of JS and §11.5.3 of Whitt (2002), the Borel σ-field on Dk+1 coincides with the σ-field on Dk+1 generated by the coordinate projections. A D^{k+1}_t-measurable real-valued function f(x) defined on Dk+1 depends on the function x in Dk+1 only through its behavior in the initial time interval [0, t].

As in step (a) of the proof of Proposition IX.1.1 of JS, we consider the map (X, M) : Ω → Dk+1 mapping the underlying probability space into Dk+1 with the sigma-field D^{k+1} generated by the coordinate projections. By this step, we are effectively lifting the underlying probability space to Dk+1, with (X, M)(t) obtained as the coordinate projection.

Let t1 and t2 with t1 < t2 be almost-sure continuity points of the limit process (X, M). Let f : Dk+1 → R be a bounded continuous D^{k+1}_{t1}-measurable real-valued function in that setting. By this construction, not only is f(X, M) Ft1-measurable, but f(Xn, Mn) is Fn,t1-measurable for each n ≥ 1 as well. (Note that f(Xn, Mn) is the composition of the maps (Xn, Mn) : Ωn → Dk+1 and f : Dk+1 → R.)

By the continuous-mapping theorem, we have first

(f(Xn, Mn),Mn(t2),Mn(t1)) ⇒ (f(X, M),M(t2),M(t1)) in R3 as n →∞

and then

f(Xn,Mn)(Mn(t2)−Mn(t1)) ⇒ f(X, M)(M(t2)−M(t1)) in R as n →∞ .

By the boundedness of f and the uniform integrability of Mn(ti), we then have

E [f(Xn,Mn)(Mn(t2)−Mn(t1))] → E [f(X, M)(M(t2)−M(t1))] as n →∞ .

Now we exploit the fact that f(Xn, Mn) is actually Fn,t1-measurable for each n. We are thus able to invoke the martingale property for each n and conclude that

E [f(Xn,Mn)(Mn(t2)−Mn(t1))] = 0 for all n .

Combining these last two relations, we have the relation

E [f(X,M)(M(t2)−M(t1))] = 0 . (8.51)

Now, by a monotone class argument (e.g., see p. 496 of EK), we can show that the relation (8.51) remains true for all bounded D^{k+1}_{t1}-measurable real-valued functions f, which includes the indicator function of an arbitrary measurable set A in D^{k+1}_{t1}. Hence,

E[ 1_{(X,M)∈A} (M(t2) − M(t1)) ] = 0 .

This in turn implies that

E[ 1_B (M(t2) − M(t1)) ] = 0

for all B in Ft1 , which implies that

E[ M(t2) − M(t1) | Ft1 ] = 0 ,

which is the desired martingale property for these special time points t1 and t2. We obtain the result for arbitrary time points t1 and t2 by considering limits from the right.

A way to get the uniform-integrability regularity condition for martingales in Theorem 8.4 is to have a local martingale with uniformly bounded jumps. The following is Corollary IX.1.19 to Proposition IX.1.17 on p. 485 of JS.


Theorem 8.5. (preservation of the local-martingale property with bounded jumps) Suppose that conditions (i)–(iii) of Theorem 8.4 hold, except that Mn is only required to be a local martingale for each n. If, in addition, for each T > 0, there exists a positive integer n0 and a constant K such that

J(Mn, T ) ≤ K for all n ≥ n0 . (8.52)

then M is a local martingale with respect to the filtration generated by X (and thus also with respect to the filtration generated by M).

The second preservation result states that, under regularity conditions, the optional quadratic variation [M] of a local martingale is a continuous function of the local martingale. The following is Corollary VI.6.7 on p. 342 of JS.

Theorem 8.6. (preservation of the optional quadratic variation) Suppose that Mn is a local martingale for each n ≥ 1 and Mn ⇒ M in D. If, in addition, for each T > 0, there exists a positive integer n0 and a constant K such that

E [J(Mn, T )] ≤ K for all n ≥ n0 , (8.53)

then

(Mn, [Mn]) ⇒ (M, [M]) in D2 .    (8.54)

Note that condition (8.53) is implied by condition (8.52). More importantly, condition (8.53) is implied by condition (8.1).

Proof of Theorem 8.2. From Theorem 8.3, we see that it suffices to focus on one coordinate at a time. To characterize the covariation processes, we can consider the weighted sums

Mn,θ ≡ ∑_{i=1}^{k} θi Mn,i    (8.55)

for arbitrary non-null vector θ ≡ (θ1, . . . , θk). First suppose that condition (i) of Theorem 8.2 is satisfied. We get the covariations via

2[Mn,i,Mn,j ] = [Mn,i + Mn,j ,Mn,i + Mn,j ]− [Mn,i,Mn,i]− [Mn,j ,Mn,j ] . (8.56)

Henceforth fix θ. First, condition (i) of Theorem 8.2 implies that

[Mn,θ](t) ⇒ cθt as n →∞ in R for all t > 0 . (8.57)

Condition (8.44) implies condition (8.52) for Mn,θ. Hence, we can apply Theorems 8.5 and 8.6 to deduce that the limit process Mθ ≡ ∑_{i=1}^{k} θi Mi is a local martingale with optional quadratic variation [Mθ] = cθ e, where cθ ≡ ∑_{i=1}^{k} ∑_{j=1}^{k} θi θj ci,j. Since this is true for all θ, we can apply Theorem 8.3 and (8.56) to complete the proof.

Now, instead, assume that condition (ii) of Theorem 8.2 is satisfied. First, by Theorem 3.1, we can deduce that Mn,θ^2 − 〈Mn,θ〉 is a local martingale for each n and θ. Next, it follows from

the assumed convergence in (8.46), Lemma 8.1 and the continuous mapping theorem that

Mn,θ^2 − 〈Mn,θ〉 ⇒ Mθ^2 − cθ e in D as n → ∞ .    (8.58)

By Theorem 8.5 and condition (8.44), the stochastic process Mθ^2 − cθ e is a local martingale for each vector θ, but that means that 〈Mθ〉(t) = cθ t for each vector θ, which implies the predictable quadratic covariation condition in Theorem 8.3. Paralleling (8.56), we use

2〈Mi, Mj〉 = 〈Mi + Mj, Mi + Mj〉 − 〈Mi, Mi〉 − 〈Mj, Mj〉 .    (8.59)

Hence, we can apply Theorem 8.3 to complete the proof.
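Both parts of the proof recover covariations from variations through the polarization identities (8.56) and (8.59). As a minimal numerical sketch (hypothetical data, not from the paper), the identity can be checked directly for pure-jump paths of bounded variation, for which the optional quadratic covariation is the sum of products of simultaneous jumps, as in Lemma A.1 of the appendix.

import numpy as np

rng = np.random.default_rng(2)

# Hypothetical jump sizes of two pure-jump paths X and Y at twenty common epochs.
jumps_X = rng.normal(size=20)
jumps_Y = rng.normal(size=20)
jumps_Y[10:] = 0.0                       # Y does not jump at the last ten epochs

def qv(jx, jy):
    # optional quadratic covariation of bounded-variation pure-jump paths:
    # the sum of products of simultaneous jumps (Lemma A.1 in the appendix)
    return float(np.sum(jx * jy))

lhs = 2.0 * qv(jumps_X, jumps_Y)
rhs = qv(jumps_X + jumps_Y, jumps_X + jumps_Y) - qv(jumps_X, jumps_X) - qv(jumps_Y, jumps_Y)
print(lhs, rhs)                          # equal up to rounding, as in (8.56)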


8.2.5. The Ethier-Kurtz Characterization Proof in Case (i).

The proof of the characterization step for case (i) on pp. 343-344 of EK is quite brief. From the pointer to Problem 7 on p. 362 of EK, one might naturally think that we should be applying the result there, which corresponds to Theorem 8.4 here, and represent the limit process as a limit of an appropriate sequence of martingales. A helpful initial observation here is that it is not actually necessary for the converging processes to be martingales. Instead of applying Theorem 8.4, we can follow the proof of Theorem 8.4 in order to obtain the desired martingale property asymptotically.

Accordingly, we directly establish that the limit process is a martingale. In particular, we show that the stochastic process

{ f(L(t)) − (cθ/2) ∫_0^t f″(L(s−)) ds : t ≥ 0 } ,    (8.60)

is a martingale, where L is the limit of the converging subsequence {Ynk}. In particular, we

will show that

E[ f(L(t+s)) − f(L(t)) − (cθ/2) ∫_t^{t+s} f″(L(u−)) du | Ft ] = 0    (8.61)

for 0 < t < t + s.

Toward that end, EK present their (1.38) and (1.39) on p. 343. The claim about convergence in probability uniformly over compact subsets of D in (1.38) on p. 343 of EK seems hard to verify, but it is not hard to prove the desired (1.39), exploiting stronger properties that hold in this particular situation. We now provide additional details about the proof of a variant of (1.39) on p. 343 of EK. The desired variant of that statement is (8.63) below.

Lemma 8.2. Under the conditions of Theorem 8.1 in case (i), if L is the limit of a convergent subsequence {Ynk : k ≥ 1}, then

∫_{t∧ηnk}^{(t+s)∧ηnk} (1/2) f″(Ynk(u−)) dAnk,θ(u) ⇒ ∫_t^{t+s} (1/2) f″(L(u−)) cθ du    (8.62)

in D([t, t + s],R) as nk →∞. As a consequence,

E[ | ∫_{t∧ηnk}^{(t+s)∧ηnk} (1/2) f″(Ynk(u−)) dAnk,θ(u) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du | ] → 0 ,    (8.63)

as in (1.39) on p. 343 of EK.

Proof of Lemma 8.2. We start by proving (8.62). First, we use the fact that both An,θ and cθ e are nondecreasing functions. We also exploit Lemma 8.1 in order to get Ank,θ ⇒ cθ e in D. The topology is uniform convergence over bounded intervals, because cθ e is a continuous function. Given that Ynk ⇒ L, we can apply Lemma 5.12 and condition (8.1) to deduce that L, and thus f″(L(t)), actually have continuous paths, so the topology is again uniform convergence over bounded intervals. Then the limit in (8.62) is elementary: First, for any ε > 0, P(‖Ynk − L‖t+s > ε) → 0 and P(ηnk ≤ t + s) → 0 as nk → ∞. As a consequence, P(‖f″ ∘ Ynk − f″ ∘ L‖t+s > ε) → 0 as nk → ∞ too. Given that ‖f″ ∘ Ynk − f″ ∘ L‖t+s ≤ ε and


ηnk ≥ t + s,

| ∫_{t∧ηnk}^{(t+s)∧ηnk} (1/2) f″(Ynk(u−)) dAnk,θ(u) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du |

≤ | ∫_t^{t+s} (1/2) f″(Ynk(u−)) dAnk,θ(u) − ∫_t^{t+s} (1/2) f″(L(u−)) dAnk,θ(u) |

+ | ∫_t^{t+s} (1/2) f″(L(u−)) dAnk,θ(u) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du |

≤ ε (Ank,θ(t+s) − Ank,θ(t)) + | ∫_t^{t+s} |(1/2) f″(L(u−))| (dAnk,θ(u) − cθ du) | ⇒ 0 as nk → ∞ .    (8.64)

Hence we have the first limit (8.62). Since f″ and the length of the interval [t, t+s] are both bounded, we also get the associated limit of expectations in (8.63).

Back to the Characterization Proof. We now apply the argument used in the proof of Theorem 8.4. Let {Fnk,t : t ≥ 0} and {Ft : t ≥ 0} denote the filtrations generated by Ynk and L, respectively, on their underlying probability spaces. Let the probability space for the limit L be (Ω, F, P). As in the proof of Theorem 8.4, we consider a real-valued function from the function space D with its usual sigma-field generated by the coordinate projections, i.e., with the filtration D ≡ {Dt : t ≥ 0}. A Dt-measurable function defined on D depends on the function x in D only through its behavior in the initial time interval [0, t].

Let t and t+s be arbitrary time points with 0 ≤ t < t+s. Let h : D → R be a bounded continuous Dt-measurable real-valued function in that setting. By this construction, not only is h(L) Ft-measurable, but h(Ynk) is Fnk,t-measurable for each k ≥ 1 as well.

By the continuous-mapping theorem, we have first

( h(Ynk), f(Ynk(t)), f(Ynk(t+s)), ∫_{t∧ηnk}^{(t+s)∧ηnk} (1/2) f″(Ynk(u−)) dAnk,θ(u) )

⇒ ( h(L), f(L(t)), f(L(t+s)), ∫_t^{t+s} (1/2) f″(L(u−)) cθ du ) in R4    (8.65)

as n →∞, and then

h(Ynk) ( f(Ynk(t+s)) − f(Ynk(t)) − ∫_{t∧ηnk}^{(t+s)∧ηnk} (1/2) f″(Ynk(u−)) dAnk,θ(u) )

⇒ h(L) ( f(L(t+s)) − f(L(t)) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du ) in R    (8.66)

as k → ∞. On the other hand, as argued by EK, by condition (8.21) and the boundedness of f and f″, the first two terms Wn,1(t, t+s) and Wn,2(t, t+s) on the right in (8.23) converge to 0 in L1 as well. Hence, the limit in (8.66) must be 0. As a consequence, we have shown that

E[ h(L) ( f(L(t+s)) − f(L(t)) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du ) ] = 0    (8.67)

for all continuous bounded Dt-measurable real-valued functions h. By the approximation argument involving the monotone class theorem (as described in the proof of Theorem 8.4),


we thus get

E[ 1_B ( f(L(t+s)) − f(L(t)) − ∫_t^{t+s} (1/2) f″(L(u−)) cθ du ) ] = 0    (8.68)

for each measurable subset B in Ft. That implies (8.61), which in turn implies the desired martingale property. Finally, Theorem 7.1.2 on p. 339 of EK (playing the role of Theorem 8.3 here) shows that this martingale property for all these smooth functions f implies that the limit L must be Brownian motion, as claimed.

8.2.6. The Ethier-Kurtz Characterization Proof in Case (ii).

The proof of characterization in this second case seems much easier for two reasons: first, the required argument is relatively straightforward (compared to the previous section) and, second, EK provides almost all the details.

We mostly just repeat the argument in EK. It suffices to focus on a limit L of a convergent subsequence {Mnk : nk ≥ 1} of the relatively compact sequence {Mn : n ≥ 1} in Dk. By condition (8.4) and Lemma 5.12, L almost surely has continuous paths. By the arguments in the EK tightness proof in §8.2.3, starting with (8.36), we have

sup_{n≥1} E[ ∑_{i=1}^{k} Mn,i(T)^2 ] < ∞

for each T > 0. As a consequence, the sequence of random vectors {Mn(T) : n ≥ 1} is uniformly integrable; see (3.18) on p. 31 of Billingsley (1999). Hence, Theorem 8.4 implies that the limit L is a martingale. Moreover, by condition (8.3) and Lemma 8.1, we have the convergence

Mnk,i Mnk,j − Ank,i,j ⇒ Li Lj − ci,j e in D as nk → ∞ .

We can conclude that the limit Li Lj − ci,j e too is a martingale by applying Theorem 8.4, provided that we can show that the sequence {Mnk,i(T) Mnk,j(T) − Ank,i,j(T) : nk ≥ 1} is uniformly integrable for each T > 0.

By (8.34), condition (8.3), and the inequality

2|Ank,i,j(T )| ≤ |Ank,i,i(T )|+ |Ank,j,j(T )| ,

the sequence {Ank,i,j(T) : nk ≥ 1} is uniformly integrable. Hence it suffices to focus on the other sequence {Mnk,i(T) Mnk,j(T) : nk ≥ 1}, and since

2 |Mnk,i(T) Mnk,j(T)| ≤ Mnk,i(T)^2 + Mnk,j(T)^2 ,

it suffices to consider the sequence {Mnk,i(T)^2 : nk ≥ 1} for arbitrary i. The required argument now is simplified because the random variables are nonnegative. Since Mnk,i(T)^2 ⇒ Li(T)^2 as nk → ∞, it suffices to show that E[Mnk,i(T)^2] → E[Li(T)^2] as nk → ∞. By (8.36)–(8.38), we already know that E[Mnk,i(T)^2] → ci,i T as nk → ∞. So what is left to prove is that

E[Li(T)^2] = ci,i T ,    (8.69)

as claimed in (1.45) on p. 344 of EK.

For this last step, we introduce stopping times

τn(x) ≡ inf{t ≥ 0 : Mn,i(t)^2 > x} and τ(x) ≡ inf{t ≥ 0 : Li(t)^2 > x} ,    (8.70)


for each x > 0. We deviate from EK a bit here. It will be convenient to guarantee that these stopping times are finite for all x. (So far, we could have L(t) = 0 for all t ≥ 0.) Accordingly, we define modified stopping times that necessarily are finite for all x. We do so by adding the continuous deterministic function (t − T)+ to the two stochastic processes. In particular, let

τn(x)′ ≡ inf{t ≥ 0 : Mn,i(t)^2 + (t−T)+ > x} and τ(x)′ ≡ inf{t ≥ 0 : Li(t)^2 + (t−T)+ > x}    (8.71)

for x ≥ 0. With this modification, the stopping times τn(x)′ and τ(x)′ are necessarily finite for all x, and yet nothing is changed over the relevant time interval [0, T]:

τn(x)′ ∧ t = τn(x) ∧ t and τ(x)′ ∧ t = τ(x) ∧ t for 0 ≤ t ≤ T (8.72)

for all x ≥ 0.

After having made this minor modification, we can apply the continuous mapping theorem with the inverse function as in §13.6 of Whitt (2002), mapping the subset of functions unbounded above in D into itself, which for the cognoscenti will be a simplification. That is, we regard the first-passage times as stochastic processes indexed by x ≥ 0. For this step, we work with the stochastic processes (random elements of D) τ′n ≡ {τn(x)′ : x ≥ 0} and similarly for the other first-passage times. We need to be careful to make this work: we need to use a weaker topology on D on the range. But Theorem 13.6.2 of Whitt (2002) implies that

(τ′nk, Mnk) ⇒ (τ′, L) in (D, M′1) × (D, J1) as k → ∞ ,    (8.73)

which in turn, by the continuous mapping theorem again, implies that

(τnk(x)′ ∧ T, Mnk) ⇒ (τ(x)′ ∧ T, L) in R × (D, J1) as k → ∞ ,    (8.74)

for all x except the at most countably many x that are discontinuity points of the limiting stochastic process {τ(x)′ : x ≥ 0}. Since we have restricted the time argument to the interval [0, T], we can apply (8.72) and use the original stopping times, obtaining

(τnk(x) ∧ T, Mnk) ⇒ (τ(x) ∧ T, L) in R × (D, J1) as k → ∞    (8.75)

again for all x except the at most countably many x that are discontinuity points of the limiting stochastic process τ′ ≡ {τ(x)′ : x ≥ 0}. By the continuous mapping theorem once more, with the composition map and the square, e.g., see VI.2.1 (b5) on p. 301 of JS, we obtain

Mnk,i(t ∧ τnk(x))^2 ⇒ Li(t ∧ τ(x))^2 in R as k → ∞    (8.76)

for all but countably many x. At this point we have arrived at a variant of the conclusion reached on p. 345 of EK. (We do not need to restrict the T because the limit process L has been shown to have continuous paths.) It is not necessary to consider the inverse function mapping a subset of D into itself, as we did; instead we could go directly to (8.75), but that joint convergence is needed in order to apply the continuous mapping theorem to get (8.76).

Now we have the remaining UI argument in (1.48) of EK, which closely parallels (8.41). We observe that the sequence {Mnk,i(T ∧ τnk(x))^2 : k ≥ 1} is UI, because

Mnk,i(T ∧ τnk(x))^2 ≤ 2 ( x + J(Mnk,i^2, T) )    (8.77)

where

E[ J(Mnk,i^2, T) ] → 0 as k → ∞    (8.78)


by virtue of condition (8.4), again using J in (5.9).

As a consequence, as in (1.49) on p. 345 of EK, for all but countably many x, we get

E[Li(T ∧ τ(x))^2] = lim_{k→∞} E[Mnk,i(T ∧ τnk(x))^2] = lim_{k→∞} E[Ank,i,i(T ∧ τnk(x))] = E[ci,i (T ∧ τ(x))] .    (8.79)

Letting x →∞ through allowed values, we get τ(x) ⇒∞ and then the desired (8.69).

9. Applications of the Martingale FCLT

After the struggle through the proof of the martingale FCLT, any one application should seem pretty easy. In this section we make two applications of the preceding martingale FCLT. First, we apply it to prove the FCLT for the scaled Poisson process, Theorem 4.2. Then we apply it to provide a third proof of Theorem 1.1. In the same way we could obtain alternate proofs of Theorems 1.2 and 7.1.

9.1. Proof of the Poisson FCLT: Theorem 4.2.

We now prove Theorem 4.2. To do so, it suffices to consider the one-dimensional version in D, since the Poisson processes are mutually independent. Let the martingales Mn ≡ MA,n be as defined in (4.13), i.e.,

Mn(t) ≡ MA,n(t) ≡ (A(nt) − nt)/√n , t ≥ 0 ,    (9.1)

with their internal filtrations. The limits in (8.1) and (8.4) hold, because

J(Mn, T) ≤ 1/√n and J(Mn^2, T) ≤ 1/n, n ≥ 1 .    (9.2)

For each n ≥ 1, Mn is square integrable, the optional quadratic variation process is

[Mn](t) ≡ [Mn, Mn](t) = A(nt)/n, t ≥ 0 ,    (9.3)

and the predictable quadratic variation process is

〈Mn〉(t) ≡ 〈Mn, Mn〉(t) = nt/n ≡ t, t ≥ 0 .    (9.4)

Hence both (8.3) and (8.5) hold trivially. By the SLLN for a Poisson process,

A(nt)/n → t w.p.1 as n → ∞    (9.5)

for each t > 0. Hence both conditions (i) and (ii) in Theorem 8.1 are satisfied, with C = c1,1 = 1.
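As a quick numerical companion to this proof (an illustration only, not part of the argument; the values of n, t and the sample size are arbitrary), the following Python sketch samples the scaled Poisson martingale Mn(t) = (A(nt) − nt)/√n of (9.1) at a fixed time t and confirms that its mean and variance match the Brownian targets 0 and t.

import numpy as np

rng = np.random.default_rng(3)

def scaled_poisson_martingale(n, t, paths=50_000):
    # Samples of M_n(t) = (A(nt) - nt)/sqrt(n) for a rate-1 Poisson process A.
    counts = rng.poisson(n * t, size=paths)          # A(nt) has a Poisson(nt) distribution
    return (counts - n * t) / np.sqrt(n)

t = 2.0
for n in [1, 10, 100, 1000]:
    m = scaled_poisson_martingale(n, t)
    print(f"n={n:5d}  mean={m.mean():+.3f}  var={m.var():.3f}  (targets: 0 and t={t})")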


9.2. Third Proof of Theorem 1.1.

The bulk of this paper has consisted of a proof of Theorem 1.1 based on the first martingale representation in Theorem 3.4, which in turn is based on the representation of the service-completion counting process as a random time change of a rate-1 Poisson process, as in (2.2). A second proof in §4.3 established the fluid limit directly.

In this subsection we present a third proof of Theorem 1.1 based on the second martingale representation in Theorem 3.5, which in turn is based on a random thinning of rate-1 Poisson processes, as in (2.5). This third proof also applies to the third martingale representation in §3.6, which is based on constructing martingales for counting processes associated with the birth-and-death process {Q(t) : t ≥ 0} via its infinitesimal generator.

Starting with the second martingale representation in Theorem 3.5 (or the third martingale representation in Subsection 3.6), we cannot rely on the Poisson FCLT to obtain the required stochastic-process limit

(Mn,1, Mn,2) ⇒ (√µ B1, √µ B2) in D2 as n → ∞ ,    (9.6)

in (4.1). However, we can apply the martingale FCLT for this purpose, and we show how to do that now.

As in §9.1, we can apply either condition (i) or (ii) in Theorem 8.1, but it is easier to apply (ii), so we will. The required argument looks more complicated because we have to establish the two-dimensional convergence in (9.6) in D2, because the scaled martingales Mn,1 and Mn,2 are not independent. Fortunately, however, they are orthogonal, by virtue of the following lemma. That still means that we need to establish the two-dimensional limit in (9.6), but it is not difficult to do so.

We say that two locally square-integrable martingales with respect to the filtration F, M1 and M2, are orthogonal if the process M1M2 is a local martingale with M1(0)M2(0) = 0. Since M1M2 − 〈M1, M2〉 is a local martingale, orthogonality implies that 〈M1, M2〉(t) = 0 for all t.

Lemma 9.1. (orthogonality of stochastic integrals with respect to orthogonal martingales) Suppose that M1 and M2 are locally square-integrable martingales with respect to the filtration F, where M1(0) = M2(0) = 0, while C1 and C2 are locally-bounded F-predictable processes. If M1 and M2 are orthogonal (which is implied by independence), then the stochastic integrals ∫_0^t C1(s) dM1(s) and ∫_0^t C2(s) dM2(s) are orthogonal, which implies that

⟨∫ C1(s) dM1(s), ∫ C2(s) dM2(s)⟩(t) = 0, t ≥ 0 ,

and

[∫ C1(s) dM1(s), ∫ C2(s) dM2(s)](t) = 0, t ≥ 0 .

Lemma 9.1 follows from the following representation for the quadratic covariation of the stochastic integrals; see §5.9 of van der Vaart (2006).

Lemma 9.2. (quadratic covariation of stochastic integrals with respect to martingales) Suppose that M1 and M2 are locally square-integrable martingales with respect to the filtration F, while C1 and C2 are locally-bounded F-predictable processes. Then

{ ∫_0^t C1(s) dM1(s) ∫_0^t C2(s) dM2(s) − [∫ C1(s) dM1(s), ∫ C2(s) dM2(s)](t) : t ≥ 0 }


and

{ ∫_0^t C1(s) dM1(s) ∫_0^t C2(s) dM2(s) − ⟨∫ C1(s) dM1(s), ∫ C2(s) dM2(s)⟩(t) : t ≥ 0 }

are local F-martingales, where the quadratic covariation processes are

[∫ C1(s) dM1(s), ∫ C2(s) dM2(s)](t) = ∫_0^t C1(s) C2(s) d[M1, M2](s)

and

⟨∫ C1(s) dM1(s), ∫ C2(s) dM2(s)⟩(t) = ∫_0^t C1(s) C2(s) d〈M1, M2〉(s), t ≥ 0 .

As a consequence of the orthogonality provided by Lemma 9.1, we have [Mn,1, Mn,2](t) = 0 and 〈Mn,1, Mn,2〉(t) = 0 for all t and n for the martingales in (9.6), which in turn come from Theorem 3.5. Thus the orthogonality trivially implies that

[Mn,1, Mn,2](t) ⇒ 0 and 〈Mn,1, Mn,2〉(t) ⇒ 0 in R as n → ∞

for all t ≥ 0. We then have

〈Mn,i, Mn,i〉(t) ⇒ ci,i t = µt in R as n → ∞

for each t and i = 1, 2, by (3.43) in Theorem 3.5 and Lemma 4.2, as in the previous argument used in the first proof of Theorem 1.1 in §§4.1–6.2. As stated above, the bulk of the proof is thus identical. By an additional argument, we can also show that

[Mn,i, Mn,i](t) ⇒ ci,i t = µt in R as n → ∞ ,

starting from (3.44).

We have just shown that (8.5) holds. It thus remains to show that the other conditions in Theorem 8.1 (ii) are satisfied. First, since we have a scaled unit-jump counting process, condition (8.4) holds by virtue of the scaling in Theorem 3.5 and (3.26). Next, (8.3) holds trivially because the predictable quadratic variation processes 〈Mn,1〉 and 〈Mn,2〉 are continuous. Hence this third proof is complete.
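To make the orthogonality used above concrete, here is a minimal simulation sketch (hypothetical parameters λ, µ and horizon T; it works with the unscaled compensated arrival and departure counting processes of a single M/M/∞ path, not with the scaled martingales of Theorem 3.5 themselves). It checks numerically that E[M1(T)M2(T)] ≈ 0, as orthogonality requires, and that E[Mi(T)^2] agrees with the expected predictable quadratic variation, as in Lemma 3.1.

import numpy as np

rng = np.random.default_rng(4)

def mm_inf_martingales(lam, mu, T, q0=0):
    # One path of an M/M/infinity queue on [0, T] (Gillespie simulation).
    # Returns the compensated arrival and departure martingales evaluated at T,
    #   M1(T) = A(T) - lam*T  and  M2(T) = D(T) - mu * int_0^T Q(s) ds,
    # together with the departure compensator mu * int_0^T Q(s) ds.
    t, q = 0.0, q0
    arrivals = departures = 0
    busy_time = 0.0                                  # int_0^t Q(s) ds so far
    while True:
        rate = lam + mu * q
        dt = rng.exponential(1.0 / rate)
        if t + dt >= T:
            busy_time += (T - t) * q
            break
        t += dt
        busy_time += dt * q
        if rng.random() < lam / rate:                # next transition is an arrival
            arrivals += 1
            q += 1
        else:                                        # next transition is a departure
            departures += 1
            q -= 1
    comp = mu * busy_time
    return arrivals - lam * T, departures - comp, comp

lam, mu, T = 20.0, 1.0, 5.0                          # hypothetical parameters
res = np.array([mm_inf_martingales(lam, mu, T) for _ in range(4000)])
M1, M2, comp = res[:, 0], res[:, 1], res[:, 2]
print("E[M1 M2]  (orthogonality, ~ 0)   :", np.mean(M1 * M2))
print("E[M2^2]   vs E[<M2>(T)]          :", np.mean(M2 ** 2), np.mean(comp))
print("E[M1^2]   vs <M1>(T) = lam*T     :", np.mean(M1 ** 2), lam * T)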

In closing this section, we observe that this alternate method of proof also applies to the Erlang-A model in §7.1 and the generalization with finite waiting rooms in Theorem 1.2.

10. Concluding Remarks

This paper has been intended to prepare readers to apply the martingale method to prove stochastic-process limits for harder queueing models, such as the GI/GI/∞ model treated by Krichagina and Puhalskii (1997), the GI/PH/n model treated by Puhalskii and Reiman (2000), and non-Markovian generalizations of the service-network models treated by Gurvich and Whitt (2006) and references cited there.

There also are other issues, which take us away from the martingale method. For example, there is also Puhalskii's (1994) continuous-mapping-theorem argument to establish associated limits for scaled waiting-time processes; also see §13.7, especially Theorem 13.7.4, in Whitt (2002) and §5.4 in the Internet Supplement, Whitt (2002a). That approach was applied by Puhalskii and Reiman (2000), Whitt (2005) and others.

There are also extra steps to establish heavy-traffic limits for the steady-state distributions of the queueing models, as was done in Halfin and Whitt (1981). See Gamarnik and Zeevi (2006) for a recent related result for generalized Jackson queueing networks. For one-dimensional diffusion processes, there is a systematic characterization of the steady-state distributions; see Browne and Whitt (1995).


11. Acknowledgments.

The author was supported by NSF grant DMI-0457095. The author thanks Jim Dai, Tom Kurtz, Columbia doctoral students Itay Gurvich, Guodong Pang and Rishi Talreja, and anonymous referees for helpful comments on this paper.


References

[1] Aldous, D. 1978. Stopping times and tightness. Ann. Probab. 6, 335–340.

[2] Aldous, D. 1989. Stopping times and tightness II. Ann. Probab. 17, 586–595.

[3] Armony, M. 2005. Dynamic routing in large-scale service systems with heterogeneous servers. Queueing Systems 51, 287–329.

[4] Armony, M., I. Gurvich and C. Maglaras. 2006. Cross-selling in a call center with a heterogeneous customer population. Working Paper, New York University, New York, NY, and Columbia University, New York, NY.

[5] Armony, M., I. Gurvich and A. Mandelbaum. 2006. Service level differentiation in call centers with fully flexible servers. Management Science, forthcoming.

[6] Atar, R. 2005. Scheduling control for queueing systems with many servers: asymptotic optimality in heavy traffic. Ann. Appl. Probab. 15, 2606–2650.

[7] Atar, R., A. Mandelbaum and M. I. Reiman. 2004a. Scheduling a multi-class queue with many exponential servers. Ann. Appl. Probab. 14, 1084–1134.

[8] Atar, R., A. Mandelbaum and M. I. Reiman. 2004b. Brownian control problems for queueing systems in the Halfin-Whitt regime. Ann. Appl. Probab. 14, 1084–1134.

[9] Billingsley, P. 1968. Convergence of Probability Measures, Wiley (second edition, 1999).

[10] Billingsley, P. 1974. Conditional distributions and tightness. Ann. Probab. 2, 480–485.

[11] Borovkov, A. A. 1967. On limit laws for service processes in multi-channel systems (in Russian). Siberian Math. J. 8, 746–763.

[12] Borovkov, A. A. 1984. Asymptotic Methods in Queueing Theory, Wiley, New York.

[13] Borst, S., A. Mandelbaum and M. I. Reiman. 2004. Dimensioning large call centers. Oper. Res. 52, 17–34.

[14] Bremaud, P. 1981. Point Processes and Queues: Martingale Dynamics, Springer.

[15] Browne, S. and W. Whitt. 1995. Piecewise-linear diffusion processes. In Advances in Queueing: Theory, Methods and Open Problems, J. Dshalalow (ed.), CRC Press, 463–480.

[16] Dai, J. G. and T. Tezcan. 2005. State space collapse in many-server diffusion limits of parallel server systems. Working Paper, Georgia Institute of Technology, Atlanta, GA.

[17] Dupuis, P. and H. Ishii. 1991. On when the solution to the Skorohod problem is Lipschitz continuous with applications. Stochastics 35, 31–62.

[18] Ethier, S. N. and T. G. Kurtz. 1986. Markov Processes: Characterization and Convergence, Wiley.

[19] Gamarnik, D. and A. Zeevi. 2006. Validity of heavy traffic steady-state approximations in generalized Jackson networks. Ann. Appl. Probab. 16, 56–90.


[20] Garnett, O., A. Mandelbaum and M. I. Reiman. 2002. Designing a call center with impatient customers. Manufacturing Service Oper. Management 4, 208–227.

[21] Glynn, P. W. and W. Whitt. 1991. A new view of the heavy-traffic limit theorem for the infinite-server queue. Adv. Appl. Prob. 23, 188–209.

[22] Gurvich, I. and W. Whitt. 2006. Fixed-queue-ratio routing in many-server service systems. Working paper, Columbia University. Available at: http://www.columbia.edu/∼ww2040/

[23] Halfin, S. and W. Whitt. 1981. Heavy-traffic limits for queues with many exponential servers. Oper. Res. 29, 567–588.

[24] Harrison, J. M. and A. Zeevi. 2005. Dynamic scheduling of a multiclass queue in the Halfin and Whitt heavy traffic regime. Oper. Res. 52, 243–257.

[25] Helms, L. L. 1974. Ergodic properties of several interacting Poisson particles. Advances in Math. 12, 32–57.

[26] Iglehart, D. L. 1965. Limit diffusion approximations for the many server queue and the repairman problem. J. Appl. Prob. 2, 429–441.

[27] Jacod, J., J. Memin and M. Metivier. 1983. On tightness and stopping times. Stochastic Proc. Appl. 14, 109–146.

[28] Jacod, J. and A. N. Shiryayev. 1987. Limit Theorems for Stochastic Processes, Springer.

[29] Jelenkovic, P., A. Mandelbaum and P. Momcilovic. 2004. Heavy traffic limits for queues with many deterministic servers. Queueing Systems 47, 53–69.

[30] Kallenberg, O. 2002. Foundations of Modern Probability, second edition, Springer.

[31] Karatzas, I. and S. Shreve. 1988. Brownian Motion and Stochastic Calculus, Springer.

[32] Krichagina, E. V. and A. A. Puhalskii. 1997. A heavy-traffic analysis of a closed queueing system with a GI/∞ service center. Queueing Systems 25, 235–280.

[33] Kurtz, T. 1975. Semigroups of conditioned shifts and approximation of Markov processes. Ann. Probab. 3, 618–642.

[34] Kurtz, T. 1978. Strong approximation theorems for density dependent Markov chains.Stoch. Process Appl. 6, 223–240.

[35] Kurtz, T. 1980. Representations of Markov processes as multiparameter time changes. Ann. Probab. 8, 682–715.

[36] Kurtz, T. 1981. Approximation of Population Processes, SIAM, Philadelphia, PA.

[37] Kurtz, T. 2001. Lectures on Stochastic Analysis, Department of Mathematics and Statistics, University of Wisconsin, Madison, WI 53706-1388.

[38] Lions, P. and A. Sznitman. 1984. Stochastic differential equations with reflecting boundary conditions. Commun. Pure Appl. Math. 37, 511–537.

[39] Liptser, R. Sh. and A. N. Shiryayev. 1986. Theory of Martingales, Kluwer.

[40] Mandelbaum, A., W. A. Massey and M. I. Reiman. 1998. Strong approximations for Markovian service networks. Queueing Systems 30, 149–201.


[41] Mandelbaum, A. and G. Pats. 1995. State-dependent queues: approximations and applications. In Stochastic Networks, F. P. Kelly, R. J. Williams (eds.), Institute for Mathematics and its Applications, Vol. 71, Springer, 239–282.

[42] Mandelbaum, A. and G. Pats. 1998. State-dependent stochastic networks. Part I: Approximations and applications with continuous diffusion limits. Ann. Appl. Prob. 8, 569–646.

[43] Mandelbaum, A. and S. Zeltyn. 2005. The Erlang-A/Palm queue, with applications to call centers. Working paper, The Technion, Haifa, Israel. Available at: http://iew3.technion.ac.il/serveng/References/references.html

[44] Massey, W. A. and W. Whitt. 1993. Networks of infinite-server queues with nonstationary Poisson input. Queueing Systems 13, 183–250.

[45] McLeish, D. L. 1974. Dependent central limit theorems and invariance principles. Ann. Probab. 2, 620–628.

[46] Parthasarathy, K. R. 1967. Probability Measures on Metric Spaces, Academic Press.

[47] Protter, P. E. 2005. Stochastic Integration and Differential Equations, second edition, Springer.

[48] Puhalskii, A. A. 1994. On the invariance principle for the first passage time. Math. Oper. Res. 19, 946–954.

[49] Puhalskii, A. A. and M. I. Reiman. 2000. The multiclass GI/PH/N queue in the Halfin-Whitt regime. Adv. Appl. Prob. 32, 564–595.

[50] Rebolledo, R. 1980. Central limit theorems for local martingales. Zeitschrift Wahrscheinlichkeitstheorie verw. Gebiete 51, 269–286.

[51] Reed, J. 2005. The G/GI/N queue in the Halfin-Whitt regime. Working paper, Georgia Institute of Technology.

[52] Rogers, L. C. G. and D. Williams. 1987. Diffusions, Markov Processes and Martingales, Volume 2: Ito Calculus, Wiley.

[53] Rogers, L. C. G. and D. Williams. 2000. Diffusions, Markov Processes and Martingales, Volume 1: Foundations, Cambridge University Press.

[54] Skorohod, A. V. 1956. Limit theorems for stochastic processes. Prob. Theory Appl. 1, 261–290.

[55] Srikant, R. and W. Whitt. 1996. Simulation run lengths to estimate blocking probabilities. ACM Trans. Modeling Computer Simulations 6, 7–52.

[56] Stone, C. 1963. Limit theorems for random walks, birth and death processes and diffusion processes. Illinois J. Math. 4, 638–660.

[57] Stroock, D. and S. R. S. Varadhan. 1979. Multidimensional Diffusion Processes, Springer-Verlag, New York.

[58] Tezcan, T. 2006. Optimal control of distributed parallel server systems under the Halfin and Whitt regime. Working Paper. Georgia Institute of Technology, Atlanta, GA.


[59] van der Vaart, A. W. 2006. Martingales, Diffusions and Financial Mathematics, Lecture Notes. Available at: http://www.math.vu.nl/sto/onderwijs/mdfm/

[60] Ward, A. R. and P. W. Glynn. 2003a. A diffusion approximation for a Markovian queue with reneging. Queueing Systems 43, 103–128.

[61] Ward, A. R. and P. W. Glynn. 2003b. Properties of the reflected Ornstein-Uhlenbeck process. Queueing Systems 44, 109–123.

[62] Ward, A. R. and P. W. Glynn. 2005. A diffusion approximation for a GI/GI/1 queue with balking or reneging. Queueing Systems 50, 371–400.

[63] Whitt, W. 1982. On the heavy-traffic limit theorem for GI/G/∞ queues. Adv. Appl. Prob. 14, 171–190.

[64] Whitt, W. 2002. Stochastic-Process Limits, Springer.

[65] Whitt, W. 2002a. Internet Supplement to Stochastic-Process Limits. Available at: http://www.columbia.edu/∼ww2040/supplement.html

[66] Whitt, W. 2004. Efficiency-driven heavy-traffic approximations for many-server queues with abandonments. Management Science 50, 1449–1461.

[67] Whitt, W. 2005. Heavy-traffic limits for the G/H∗2/n/m queue. Math. Oper. Res. 30, 1–27.

[68] Zeltyn, S. and A. Mandelbaum. 2005. Call centers with impatient customers: many-server asymptotics of the M/M/n+G queue. Queueing Systems 51, 361–402.


APPENDIX

A. Proof of Lemma 3.1.

Let ∆ be the jump function, i.e., ∆X(t) ≡ X(t) − X(t−), and let

∑_{s≤t} (∆X(s))^2, t ≥ 0 ,

represent the sum of all squared jumps. Since (∆X(s))^2 > 0, the sum is independent of the order. Hence the sum is necessarily well defined, although possibly infinite. (For any positive integer n, a function in D has at most finitely many jumps with |x(t) − x(t−)| > 1/n over any bounded interval; see Lemma 1 on p. 110 of Billingsley (1968). Hence, the sample paths of each stochastic process in D have only countably many jumps.)

The optional quadratic variation has a very simple structure for locally-bounded-variation processes, i.e., processes with sample paths of bounded variation over every bounded interval; see (18.1) on p. 27 of Rogers and Williams (1987).

Lemma A.1. (optional quadratic covariation and variation for a locally-bounded-variation process) If stochastic processes X and Y almost surely have sample paths of bounded variation over each bounded interval, then

[X, Y](t) = ∑_{s≤t} (∆X(s))(∆Y(s)), t ≥ 0 ,    (1.1)

and

[X](t) ≡ [X, X](t) = ∑_{s≤t} (∆X(s))^2, t ≥ 0 .    (1.2)

Lemma A.2. (optional quadratic variation for a counting process) If N is a non-explosive unit-jump counting process with compensator A, both adapted to the filtration F, then N − A is a locally square-integrable martingale of locally bounded variation with square-bracket process

[N−A](t) = ∑_{s≤t} (∆(N−A)(s))^2 = N(t) − 2 ∫_0^t ∆A(s) dN(s) + ∫_0^t ∆A(s) dA(s), t ≥ 0 .    (1.3)

If, in addition, the compensator A is continuous, then

[N −A] = N , (1.4)

Proof. For (1.3), we apply (1.2) in Lemma A.1; see §5.8 of van der Vaart (2006). For (1.4), we observe that the last two terms in (1.3) become 0 when A is continuous.
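As a simple numerical check of Lemma A.2, and of Lemma 3.1 in the Poisson special case (an illustration with arbitrary rate and time point, not part of the proof), the following Python sketch uses a Poisson process N with continuous compensator A(t) = λt, so that [N−A] = N and 〈N−A〉 = A, and verifies E[M(t)^2] = E[N(t)] = λt empirically.

import numpy as np

rng = np.random.default_rng(5)

lam, t, paths = 3.0, 4.0, 200_000
N_t = rng.poisson(lam * t, size=paths)   # N(t) for a rate-lam Poisson process
M_t = N_t - lam * t                      # M = N - A with continuous compensator A(t) = lam*t

# By Lemma A.2, [M](t) = N(t) path by path (the jumps of M are exactly the unit jumps of N),
# and by Lemma A.3, <M>(t) = A(t); so E[M(t)^2] = E[N(t)] = A(t) = lam*t.
print("E[M(t)^2] :", np.mean(M_t ** 2))
print("E[N(t)]   :", np.mean(N_t))
print("A(t)      :", lam * t)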

Proof of Lemma 3.1. We will directly construct the compensator of the submartingale {M(t)^2 : t ≥ 0}. Since (i) N is a non-explosive counting process, (ii) E[N(t)] < ∞ and E[A(t)] < ∞ for all t and (iii) N and A have nondecreasing sample paths, the sample paths of the martingale M ≡ N − A provided by Theorem 3.1 are of bounded variation over bounded intervals. Since N(0) = 0, M(0) = 0. We can thus start by applying integration by parts,


referred to as the product formula on p. 336 of Bremaud (1981), to write

M(t)^2 = ∫_0^t M(s−) dM(s) + ∫_0^t M(s) dM(s)

= 2 ∫_0^t M(s−) dM(s) + ∫_0^t (M(s) − M(s−)) dM(s)

= 2 ∫_0^t M(s−) dM(s) + [M](t)

= 2 ∫_0^t M(s−) dM(s) + N(t), t ≥ 0 .    (1.5)

The last step follows from (1.4) in Lemma A.2.

We now want to show that the stochastic integral ∫_0^t M(s−) dM(s) is a martingale. To get the desired preservation of the martingale structure, we can apply the integration theorem, Theorem T6 on p. 10 of Bremaud (1981), but we need additional regularity conditions. At this point, we localize in order to obtain boundedness.

To obtain such bounded martingales associated with N and M ≡ N − A, let the stopping times be defined in the obvious way by

τm ≡ inf{t ≥ 0 : |M(t)| ≥ m or A(t) ≥ m} .    (1.6)

Then {τm ≤ t} ∈ Ft, t ≥ 0. We now define the associated stopped processes: let

Mm(t) ≡ M(t ∧ τm), Nm(t) ≡ N(t ∧ τm) and Am(t) ≡ A(t ∧ τm), t ≥ 0 , (1.7)

for each m ≥ 1. Then Nm(t) = Mm(t) + Am(t) for t ≥ 0 and {Mm(t) : t ≥ 0} is a martingale with respect to {Ft} having compensator {Am(t) : t ≥ 0} for each m ≥ 1, as claimed. Moreover, all three stochastic processes Nm, Mm and Am are bounded. The boundedness follows since N has unit jumps and A is continuous.

We then obtain the representation for these stopped processes corresponding to (1.5); in particular,

Mm(t)^2 = 2 ∫_0^t Mm(s−) dMm(s) + Nm(t), t ≥ 0 ,    (1.8)

for each m ≥ 1. With the extra boundedness provided by the stopping times, we can apply Theorem T6 on p. 10 of Bremaud (1981) to deduce that the integral ∫_0^t Mm(s−) dMm(s) is an F-martingale. First, Mm(t) is a martingale of integrable bounded variation with respect to F, as defined on p. 10 of Bremaud (1981). Moreover, the process {Mm(t−) : t ≥ 0} is an F-predictable process such that

∫_0^t |Mm(s−)| d|Mm|(s) < ∞ .

Theorem T6 on p. 10 of Bremaud (1981) implies that the integral ∫_0^t Mm(s−) dMm(s) is an F-martingale. Thus {Mm(t)^2 − Nm(t) : t ≥ 0} is an F-martingale. But {Nm(t) − Am(t) : t ≥ 0} is also an F-martingale. Adding, we see that {Mm(t)^2 − Am(t) : t ≥ 0} is an F-martingale for each m. Thus, for each m, the predictable quadratic variation of Mm is

〈Mm〉(t) = Am(t). (1.9)

Now we can let m ↑ ∞ and apply Fatou’s Lemma to get

E[M(t)^2] = E[ lim_{m→∞} Mm(t)^2 ] ≤ lim inf_{m→∞} E[ Mm(t)^2 ] = lim inf_{m→∞} E[ Am(t) ] = E[ A(t) ] < ∞.


Therefore, M itself is square integrable. We can now apply the monotone convergence theorem in the conditioning framework, as on p. 280 of Bremaud (1981), to get

E[Mm(t+s)^2 | Ft] → E[M(t+s)^2 | Ft] and E[Am(t+s) | Ft] → E[A(t+s) | Ft]

as m →∞ for each t ≥ 0 and s > 0. Then, since

E[Mm(t+s)^2 − Am(t+s) | Ft] = Mm(t)^2 − Am(t) for all m ≥ 1 ,

Mm(t) → M(t) and Am(t) → A(t), we have

E[M(t+s)^2 − A(t+s) | Ft] = M(t)^2 − A(t)

as well, so that M^2 − A is indeed a martingale. Of course that implies that 〈M〉 = A, as claimed. We get [M] = N from Lemma A.2, as noted at the beginning of the proof.

We remark that there is a parallel to Lemma A.2 for the angle-bracket process, applying to cases in which the compensator is not continuous. In contrast to Lemma 3.1, we now do not assume that E[N(t)] < ∞, so we need to localize.

Lemma A.3. (predictable quadratic variation for a counting process) If N is a non-explosive unit-jump counting process with compensator A, both adapted to the filtration F, then N − A is a locally square-integrable martingale of locally bounded variation with angle-bracket process

〈N−A〉(t) = 〈[N−A]〉(t) = A(t) − ∫_0^t ∆A(s) dA(s) = ∫_0^t (1 − ∆A(s)) dA(s), t ≥ 0 .    (1.10)

If, in addition, the compensator A is continuous, then

〈N −A〉 = A . (1.11)

Proof. For (1.10), we exploit the fact that 〈N−A〉 = 〈[N−A]〉; see p. 377 of Rogers and Williams (1987) and §5.8 of van der Vaart (2006). The third term on the right in (1.3) is predictable and thus its own compensator. The compensators of the first two terms in (1.3) are obtained by replacing N by its compensator A. See Problem 3 on p. 60 of Liptser and Shiryayev (1987).
