

STATE-INDEPENDENT IMPORTANCE SAMPLING FOR RANDOM WALKS WITH REGULARLY VARYING INCREMENTS

Karthyek R. A. Murthy*^a, Sandeep Juneja^a, and Jose Blanchet^b

^a Tata Institute of Fundamental Research, Mumbai, India
^b Columbia University, New York, United States

Abstract

We develop state-independent importance sampling based efficient simulation techniques for two commonly encountered rare event probabilities associated with a random walk (S_n : n ≥ 0) having i.i.d. regularly varying heavy-tailed increments; namely, the level crossing probabilities when the increments of S_n have a negative mean, and the large deviation probabilities P{S_n > b}, as both n and b increase to infinity, for the zero mean random walk. Exponential twisting based state-independent methods, which are effective in efficiently estimating these probabilities for light-tailed increments, are not applicable when these are heavy-tailed. To address the latter case, more complex and elegant state-dependent efficient simulation algorithms have been developed in the literature over the last few years. We propose that by suitably decomposing these rare event probabilities into a dominant and further residual components, simpler state-independent importance sampling algorithms can be devised for each component, resulting in composite unbiased estimators with a desirable vanishing relative error property. When the increments have infinite variance, there is an added complexity in estimating the level crossing probabilities as even the well known zero variance measures have an infinite expected termination time. We adapt our algorithms so that this expectation is finite while the estimators remain strongly efficient. Numerically, the proposed estimators perform at least as well, and sometimes substantially better, than the existing state-dependent estimators in the literature.

1. Introduction

In this paper, we develop importance sampling algorithms involving simple, state-independent changes of measure for the efficient estimation of large deviation probabilities, and level crossing probabilities, of random walks with regularly varying heavy-tailed increments. Specifically, let (X_n : n ≥ 1) denote a sequence of zero mean independent and identically distributed (i.i.d.) random variables such that P(X_n > x) = L(x) x^{−α}, for some α > 1 and a slowly varying function¹ L(·). Note that α > 2 ensures finite variance for X_n whereas α < 2 implies that it has infinite variance. Set S_0 = 0 and S_n = X_1 + ... + X_n, for n ≥ 1. Given µ > 0, define M := sup_n (S_n − nµ), and τ_b := inf{n ≥ 0 : S_n − nµ > b}. We are interested in the importance sampling based efficient estimation of:

1. Large deviation probabilities P{S_n > b} for b > n^{β+ε} with β := (α ∧ 2)^{−1}, as n ր ∞, and

2. Level crossing probabilities P{τ_b < ∞}, or equivalently, the tail probabilities P{M > b}, as b ր ∞.

* Corresponding author. E-mail address: [email protected].
¹That is, lim_{x→∞} L(tx)/L(x) = 1 for any t > 0; prominent examples of slowly varying functions include (log x)^β for any β ∈ R.


For brevity, we refer to the former as large deviations probabilities and the latter as level crossing probabilities. Our methodology for estimating the large deviations probabilities easily extends to the efficient estimation of P{S_N > u} for a random N, when N is light-tailed² and independent of the increments X_n (popular in the literature are N fixed or geometrically distributed), as u ր ∞. However, in the interest of space, we do not explicitly consider the 'random sum tail probabilities' estimation problem in this paper.

Importance sampling via an appropriate change of measure has been extremely successful in efficiently simulating rare events, and has been studied extensively in both the light and heavy tailed settings (see, e.g., Asmussen and Glynn (2007) for an introduction to rare event simulation and applications). In importance sampling for random walks, state-dependence essentially means that the sampling distribution for generating the increment X_k depends on the realized values of X_1, ..., X_{k−1} (typically, through S_{k−1}); state-independence, on the other hand, implies that samples of X_1, ..., X_n can be drawn independently. State-independent methods often enjoy advantages over state-dependent ones in terms of the complexity of generating samples and ease of implementation. The zero-variance changes of measure for estimating the large deviations and the level crossing probabilities are well known and are state-dependent (see, e.g., Juneja and Shahabuddin 2006). While typically unimplementable, they provide guidance in the search for implementable, approximately zero variance importance sampling techniques.

In the light-tailed settings, large deviations analysis can be used to show that exponential twisting based state-independent importance sampling well approximates the zero variance measure (see, e.g., Asmussen and Glynn (2007)) and also efficiently estimates the large deviations as well as level crossing probabilities (see, e.g., Sadowsky and Bucklew (1990) and Siegmund (1976)). However, the development of state-independent techniques for these probabilities is harder in the heavy-tailed settings. Asmussen et al. (2000) provide an account of the failure of simple large deviations based simulation methods that approximate the zero-variance measure in heavy-tailed systems. Bassamboo et al. (2007) prove that no state-independent importance sampling change of measure can efficiently simulate the level crossing probability in a busy cycle of a heavy-tailed random walk. The fact that the zero-variance measures for estimating both the large deviations and the level crossing probabilities are state-dependent, and the above mentioned negative results, have motivated research over the last few years in the development of complex and elegant state-dependent algorithms to efficiently estimate these probabilities (see, e.g., Dupuis et al. (2007), Blanchet and Glynn (2008), Blanchet and Liu (2008, 2012), and Chan et al. (2012)).

In this paper we introduce simple state-independent changes of measure to estimate the large deviations and the level crossing probabilities with regularly varying increments. We show that the proposed methods are provably efficient³ and perform at least as well as the existing state-dependent algorithms. Thus our key contribution is to question the prevailing view that one needs to resort to state-dependent methods for efficient computation of rare event probabilities involving a 'large number' of heavy-tailed random variables. A key idea to be exploited in the estimation of the probabilities considered is the fact that the corresponding rare event occurrence is governed by the "single big jump" principle, that is, the most likely paths leading to the occurrence of the rare event have one of the increments taking a large value (see, e.g., Foss et al. (2011) and the references therein).

²As is well-known, X is light-tailed if the moment generating function E[exp(θX)] is finite for some θ > 0, and is heavy-tailed otherwise.

³We show that the estimators have asymptotically vanishing relative error; this corresponds to their coefficient of variation converging to zero as the event becomes rarer. We also have a related weaker notion of strong efficiency, where the coefficient of variation of the estimators, and subsequently the number of i.i.d. replications required, remains bounded as the event becomes rarer. Weak efficiency is another standard notion of performance in rare event simulation, corresponding to a slow increase in the number of replications required as the event becomes rarer. These are briefly reviewed in Section 2.2.


Our approach for estimating the large deviations probability P{S_n > b} relies on decomposing it into a dominant and a residual component, and developing efficient estimation techniques for both. For estimating the level crossing probability P{τ_b < ∞}, in addition to such a decomposition, we partition the event of interest into several blocks that are sampled using appropriate randomization. When the increments have infinite variance, there is an added complexity in estimating the level crossing probabilities as even the well known zero variance measure is known to have an infinite expected termination time. We modify our algorithms so that this expectation remains finite while the estimator remains strongly efficient, although it may no longer have asymptotically vanishing relative error.

Our specific contributions are as follows:

1. We provide importance sampling estimators that achieve asymptotically vanishing relative error for the estimation of P{S_n > b}, as n ր ∞. Given n and ε > 0, our simulation methodology is uniformly efficient for values of b larger than n^{1/2+ε} when the increments X_n have finite variance, and for b > n^{1/α+ε} in the case of increments having infinite variance, thus operating throughout the large deviations regime where the well-known asymptotics P{S_n > b} ∼ n F̄(b) hold. Further, this is the first instance that we are aware of where efficient simulation techniques for the large deviations probability include the case of increments having infinite variance, which is not uncommon in practical applications involving heavy-tailed random variables.

2. For α > 1, we develop unbiased estimators for level crossing probabilities P{τ_b < ∞} that achieve vanishing relative error as b ր ∞. These estimators require an overall computational effort that scales as O(b) when the variance of X_n is finite. This is similar to the complexity of the zero variance estimator since, as is well known, the latter requires order E[τ_b | τ_b < ∞] computation in generating a single sample, and this is known to be linear in b when the variance of the increments is finite (see Asmussen and Kluppelberg (1996)). However, since E[τ_b | τ_b < ∞] = ∞ for the case of increments having infinite variance, even the zero-variance measure (even if implementable) is no longer viable because, from a computational standpoint, any useful estimator needs to have finite expected replication termination time. For random walks with infinite variance increments, we develop algorithms such that:

(a) For α > 1.5, the associated estimators are strongly efficient and have O(b) expected termination time. As a converse, we also prove that for α < 1.5 no algorithm can be devised in our framework that has both the variance and the expected termination time simultaneously finite. The situation is more nuanced when α = 1.5 and depends on the form of the slowly varying function L(·).

(b) For α ≤ 1.5, each replication of the estimator terminates in O(b) time on average and we require only O(1) replications, thus resulting in an overall complexity of O(b). The relative deviation (the ratio of the absolute difference between the estimator and the true value to the true value) of the values returned by the algorithm is well within the specified limits with high probability, even though the estimator variance is infinite.

The above results for infinite increment variance, and in particular the bottleneck arising at α = 1.5, closely mirror the results proved in Blanchet and Liu (2012), where vastly different state-dependent algorithms are considered.

A brief discussion on practical applications and a literature review may be in order: Efficient estimation of the level crossing probability is important in many practical contexts, e.g., in computing the steady state probability of large delays in GI/GI/1 queues and in ruin probabilities in insurance settings (see, e.g., Asmussen and Glynn (2007)). Siegmund (1976) provides the first weakly efficient importance sampling algorithm for estimating the level crossing probabilities when the increments X_n are light-tailed, using a large deviations based exponentially twisted change of measure. Sadowsky and Bucklew (1990) develop a weakly efficient algorithm for estimating P(S_n > na) for a > 0 and X_i light-tailed, again using an exponential twisting based importance sampling distribution (also see Sadowsky (1996), Dupuis and Wang (2004), Blanchet et al. (2009), Dieker and Mandjes (2005) and Agarwal et al. (2013) for related analysis). This problem is important mainly because it forms a building block for many more complex rare event problems involving combinations of renewal processes: for examples in queueing, see Parekh and Walrand (1989), and in financial credit risk modeling, see Glasserman and Li (2005) and Bassamboo et al. (2008). Research on efficient simulation of rare events involving heavy-tailed variables first focussed on probabilities such as P{S_N > b} in the simpler asymptotic regime where N is fixed or geometrically distributed and b ր ∞. In this simpler setting state-independent algorithms are easily designed (see, e.g., Asmussen et al. (2000), Juneja and Shahabuddin (2002), Asmussen and Kroese (2006)). In Rajhaa and Juneja (2012), it is shown that a variant of capped exponential twisting based state-independent importance sampling, which does not involve any decomposition, provides a strongly efficient estimator for the large deviations probability that we consider in this paper.

Statistical analysis reveals that heavy-tailed distributions are very common in practice: in particular, heavy-tailed increments with infinite variance are a convenient means to explain the long-range dependence observed in tele-traffic data, and to model highly variable claim sizes in insurance settings. Popular references to this strand of literature include Embrechts et al. (1997), Resnick (1997), and Adler et al. (1998).

The organization of the remainder of the paper is as follows: In Section 2 we discuss preliminary concepts relevant to the problems addressed. We propose our importance sampling method for estimating the large deviations probability and prove its efficiency in Section 3. In Section 4, we develop algorithms for estimating the level crossing probability. Proofs of some of the key results pertaining to the efficiency of the proposed algorithms and their expected termination time are given in Section 5. Numerical experiments supporting our algorithms are given in Section 6, followed by a brief conclusion in Section 7. Some of the more technical proofs are presented in the appendix.

2. Preliminary Background

In this section we briefly review the use of importance sampling in estimating rare event probabilities, and the well-known asymptotics for relevant tail probabilities in the existing literature. Throughout this paper, we use Landau's notation for describing the asymptotic behaviour of functions: for given functions f : R_+ → R_+ and g : R_+ → R_+, we say f(x) = O(g(x)) if there exist c_1 > 0 and x_1 large enough such that f(x) ≤ c_1 g(x) for all x > x_1; and f(x) = Ω(g(x)) if there exist c_2 > 0 and x_2 large enough such that f(x) ≥ c_2 g(x) for all x > x_2. We use f(x) = o(g(x)) if f(x)/g(x) → 0, and f(x) ∼ g(x) if f(x)/g(x) → 1, as x ր ∞.

2.1. Rare event simulation and importance sampling. Let A denote a rare event on the probability space (Ω, F, P), i.e., z := P(A) > 0 is small (in our setup A corresponds to the events {S_n > b} or {τ_b < ∞}). Suppose that we are interested in obtaining an estimator ẑ for z such that the relative deviation |ẑ − z|/z < ε, with probability at least 1 − δ, for given ε and δ > 0. Naive simulation for estimating z involves drawing N independent samples of the indicator I_A and taking their sample mean as the estimator. For a different measure P̃(·) such that the Radon-Nikodym derivative dP/dP̃ is well defined on A, we get:

P(A) = ∫_A (dP/dP̃)(ω) dP̃(ω) = Ẽ[L I_A],

where L := dP/dP̃ and Ẽ[·] is the expectation associated with P̃(·). Define Z := L I_A; then Z is an unbiased estimator of z under the measure P̃(·). If N i.i.d. samples Z_1, ..., Z_N of Z are drawn from P̃(·), then by the strong law of large numbers we have:

ẑ_N := (Z_1 + ... + Z_N)/N → z a.s.,

as N ր ∞. This method of arriving at an estimator is called importance sampling (IS). The measure P̃(·) is called the importance sampling measure and Z is called an importance sampling estimator. Using Chebyshev's inequality allows us to find an upper bound on the number of replications N required to achieve the desired relative precision:

P̃( |ẑ_N − z|/z > ε ) ≤ Var(ẑ_N)/(z² ε²) = CV²(Z)/(N ε²).

Here CV(Z) = √Var(Z)/z is the coefficient of variation of Z. This enables us to conclude that if we generate at least

N = CV²(Z)/(δ ε²)    (1)

i.i.d. samples of Z for computing ẑ_N, we can guarantee the desired relative precision. In naive simulation we use the measure P(·) itself and have Z = I_A as the estimator; so the number of samples required in (1) grows (roughly proportional to z^{−1}) to infinity as z ց 0. As is well known, the choice P*(·) := P(·|A) as an importance sampling measure yields zero variance for the associated estimator Z = z I_A (see e.g., Asmussen and Glynn (2007)). Then, every sample obtained in simulation equals z with P*(·) probability 1. However, the explicit dependence of Z on z, the quantity which we want to estimate, makes this method impractical.
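
As a small illustration of the bound (1), the following Python sketch (not part of the original development; all names are chosen here for illustration) computes the number of i.i.d. replications needed for a prescribed relative precision from an estimator's coefficient of variation.

import math

def required_replications(cv, eps, delta):
    """Replication count from the Chebyshev bound (1):
    N = CV^2(Z) / (delta * eps^2) guarantees relative deviation <= eps
    with probability at least 1 - delta."""
    return math.ceil(cv ** 2 / (delta * eps ** 2))

# Example: an estimator with coefficient of variation 2,
# targeting 10% relative error with 95% confidence.
print(required_replications(cv=2.0, eps=0.1, delta=0.05))  # prints 8000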

2.2. Efficiency notions of algorithms. Consider a family of events {A_n : n ≥ 1} such that z_n := P(A_n) ց 0 as the rarity parameter n ր ∞. For an importance sampling algorithm to compute (z_n : n ≥ 1), we come up with a sequence of changes of measure (P_n(·) : n ≥ 1) and estimators (Z_n : n ≥ 1) such that E_n Z_n = z_n, where E_n[·] denotes the expectation operator under P_n(·).

Definition 1. The sequence (Z_n : n ≥ 1) of unbiased importance sampling estimators of {z_n : n ≥ 1} is said to achieve asymptotically vanishing relative error if

lim_{n→∞} E_n[Z_n²] / z_n² ≤ 1.    (2)

The sequence (Z_n : n ≥ 1) is said to be strongly efficient if

lim_{n→∞} E_n[Z_n²] / z_n² < ∞,    (3)

and weakly efficient if for all ε > 0,

lim_{n→∞} E_n[Z_n²] / z_n^{2−ε} < ∞.    (4)


The significance of these definitions can be seen from (1): if an algorithm is strongly efficient, the number of simulation runs required to guarantee the desired relative precision stays bounded as n ր ∞. If Var(Z_n) = o(z_n²), then (Z_n : n ≥ 1) satisfies the asymptotically vanishing relative error property. As a result, it is enough to generate o(δ^{−1} ε^{−2}) i.i.d. replications of the estimator. As is apparent from the definition, all strongly efficient algorithms are weakly efficient, and vanishing relative error is the strongest notion among all three. Also it can be verified that naive simulation is not even weakly efficient.

2.3. Related asymptotics. In this section, we list the well-known asymptotics of the quantities of interest; these asymptotic representations will be useful for arriving at importance sampling measures and proving their efficiency.

1. Recall that β := (α ∧ 2)^{−1}. Then we have

P{S_n > b} ∼ n F̄(b), as n ր ∞,    (5)

for b > n^{β+ε}, ε > 0 (see Mikosch and Nagaev (1998) and references therein). Additionally, the following relations can be found in Mikosch and Nagaev (1998): as n ր ∞,

P{ S_n > b, max_{k≤n} X_k < b } = o( n F̄(b) ),    (6)

sup_{b>n^{β+ε}} | P{ #{1 ≤ i ≤ n : X_i > b} = 1 | S_n ≥ b } − 1 | = o(1), and

sup_{b>n^{β+ε}} | P{ max_{k≤n−1} X_k ≤ b, S_n ≥ b | X_n > b } − 1 | = o(1).

These large deviations asymptotics reveal that, with the number of summands growing to infinity, with high probability the sum becomes large because one of the component increments becomes large.

2. Recall that τ_b := inf{k : S_k > b + kµ} and M := sup_n (S_n − nµ); the events {M > b} and {τ_b < ∞} are the same. Let F_I(·) denote the integrated tail of F(·) as below:

F_I(x) := ∫_x^∞ F̄(u) du, for x ≥ 0.

The following asymptotics are well-known (see, e.g., Foss et al. (2011) and references therein). As b ր ∞, uniformly for any positive integer n,

P{τ_b < n} ∼ (1/µ) ∫_b^{b+nµ} F̄(u) du, and P{τ_b < ∞} ∼ (1/µ) F_I(b).    (7)

Then for any positive integers n_1 and n_2 with n_1 < n_2,

P{n_1 < τ_b ≤ n_2} ∼ (1/µ) ∫_{b+n_1µ}^{b+n_2µ} F̄(u) du = ( F_I(b + n_1µ) − F_I(b + n_2µ) ) / µ,    (8)

uniformly in n_1 and n_2, as b ր ∞.

Further, the following characterization of the zero-variance measure P{·|τ_b < ∞}, as in Theorem 1.1 of Asmussen and Kluppelberg (1996), sheds light on how the first passage over a level b happens asymptotically: If we use a(b) := F_I(b)/F̄(b), then conditional on {τ_b < ∞},

( τ_b/a(b), ( S_{⌊uτ_b⌋}/τ_b : 0 ≤ u < 1 ), (S_{τ_b} − b)/a(b) ) =⇒ ( Y_0/µ, (−uµ : 0 ≤ u < 1), Y_1 )    (9)

in R × D[0, 1) × R. The joint law of (Y_0, Y_1) is defined as follows: for y_0, y_1 ≥ 0, P{Y_0 > y_0, Y_1 > y_1} = P{Y_1 > y_0 + y_1} with Y_0 =_d Y_1, and

P{Y_1 > y_1} = 1 / (1 + y_1/(α − 1))^{α−1}.

Now we state a part of Karamata's theorem that provides an asymptotic characterization of the integrated tails of regularly varying functions: Consider a regularly varying function V(·) with index −α; if β is such that α − β > 1, then

∫_x^∞ u^β V(u) du ∼ x^{β+1} V(x) / (α − β − 1), as x ր ∞.    (10)

See Embrechts et al. (1997) or Borovkov and Borovkov (2008) for further details.
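
To make these preliminaries concrete, the following sketch assumes a Pareto-type tail F̄(x) = (1 + x)^{−α} (a concrete choice made only for illustration, not one used in the paper), evaluates the integrated tail F_I in closed form, compares it with the Karamata approximation (10) applied with V = F̄ and β = 0, and uses (7) to approximate the level crossing probability.

def tail(x, alpha):
    """Illustrative regularly varying tail: F̄(x) = (1 + x)^(-alpha), slowly varying part L ≡ 1."""
    return (1.0 + x) ** (-alpha)

def integrated_tail(x, alpha):
    """Closed-form integrated tail: F_I(x) = ∫_x^∞ F̄(u) du = (1 + x)^(1 - alpha) / (alpha - 1)."""
    return (1.0 + x) ** (1.0 - alpha) / (alpha - 1.0)

def karamata_approx(x, alpha):
    """Karamata's theorem (10) with beta = 0: ∫_x^∞ F̄(u) du ≈ x * F̄(x) / (alpha - 1)."""
    return x * tail(x, alpha) / (alpha - 1.0)

def level_crossing_approx(b, alpha, mu):
    """Asymptotic (7): P{tau_b < ∞} ~ F_I(b) / mu as b grows."""
    return integrated_tail(b, alpha) / mu

alpha, b, mu = 2.5, 1e4, 1.0
print(integrated_tail(b, alpha), karamata_approx(b, alpha))   # the two values agree closely for large b
print(level_crossing_approx(b, alpha, mu))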

3. Simulation of {S_n > b}

Let X be a zero mean random variable with distribution F(·) satisfying the following:

Assumption 1. The tail probabilities are given by F̄(x) := P{X > x} = x^{−α} L(x), for some slowly varying function L(·) and α > 1. If Var[X] = ∞, then

lim_{x→∞} P{X < −x} / P{X > x} < ∞.

For the independent collection (X_n : n ≥ 1) of random variables which are distributed identically as X, define the random walk (S_n : n ≥ 0) as below:

S_0 = 0, and S_n = X_1 + ... + X_n for n ≥ 1.
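
For the illustrative code sketches that accompany this and later sections we need one concrete distribution satisfying Assumption 1. The snippet below (an assumed example, not the distribution used in the paper's experiments) takes X = V − E[V] with V a standard Pareto(α) variable, so that P{X > x} is regularly varying with index −α and the increments have mean zero; it then simulates the partial sums S_n.

import numpy as np

def sample_increments(n, alpha, rng):
    """Zero-mean increments with regularly varying right tail of index alpha:
    X = V - E[V], where V is standard Pareto(alpha) on [1, ∞) with E[V] = alpha/(alpha-1).
    For this choice the left tail is bounded, so the ratio condition in Assumption 1 holds trivially."""
    u = rng.random(n)
    v = (1.0 - u) ** (-1.0 / alpha)          # Pareto(alpha) sample via inverse transform
    return v - alpha / (alpha - 1.0)          # subtract the mean to center the increments

def random_walk(n, alpha, rng):
    """Partial sums S_0 = 0, S_k = X_1 + ... + X_k."""
    x = sample_increments(n, alpha, rng)
    return np.concatenate(([0.0], np.cumsum(x)))

rng = np.random.default_rng(0)
S = random_walk(1000, alpha=2.5, rng=rng)
print(S[-1])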

In this section we devise a simulation procedure for estimating the large deviation probabilities P{S_n > b} for b > n^{β+ε}, given any ε > 0, and prove its efficiency as n ր ∞. Recall that β := (α ∧ 2)^{−1}. The strategy is to partition the event {S_n > b} based on whether the maximum of the increments X_1, ..., X_n has exceeded the large value b or not (Juneja 2007 considers this approach when n is fixed):

A_dom(n, b) := { S_n > b, max_{k≤n} X_k ≥ b } and A_res(n, b) := { S_n > b, max_{k≤n} X_k < b }.

The asymptotics (5) and (6) in Section 2.3 indicate that for large values of n, the most likely way for the sum S_n to exceed b is to have at least one of the increments X_1, ..., X_n exceed b. Hence the probability of the event A_res is vanishingly small compared to the probability of A_dom, as n ր ∞; the suffixes indicate that A_dom is the dominant way of occurrence of {S_n > b} for large n, and the other event has only residual contributions. We estimate P(A_dom) and P(A_res) independently via different changes of measure that typify the way in which the respective events occur, and add the individual estimates to arrive at the final estimator for P{S_n > b}.


3.1. Simulating A_dom. For the simulation of A_dom, we follow the two-step procedure outlined in Chan et al. (2012):

1. Choose an index I uniformly at random from {1, ..., n}.

2. For k = 1, ..., n, generate a realization of X_k from F(·|X_k ≥ b) if k = I; otherwise, generate X_k from F(·).

Let P̃(·) denote the measure induced when the increments are generated according to the above procedure, and let Ẽ[·] denote the corresponding expectation operator. Note that the probability measure P(·) is absolutely continuous with respect to P̃(·) when restricted to A_dom. We have

dP̃(x_1, ..., x_n) = Σ_{k=1}^{n} (1/n) · ( dF(x_1) ... dF(x_n) / F̄(b) ) 1(x_k ≥ b).

Therefore the likelihood ratio on the set A_dom is given by

(dP/dP̃)(X_1, ..., X_n) = n F̄(b) / #{X_i ≥ b : 1 ≤ i ≤ n},

and the resulting unbiased estimator for the evaluation of P(A_dom) is

Z_dom(n, b) := ( n F̄(b) / #{X_i ≥ b : 1 ≤ i ≤ n} ) I(A_dom).    (11)
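
A minimal sketch of this two-step procedure and the estimator (11), assuming the shifted-Pareto increments introduced after Assumption 1 (for which F(·|X ≥ b) can be sampled in closed form); the constants ALPHA and C below are parameters of that illustrative example, not of the paper.

import numpy as np

ALPHA = 2.5
C = ALPHA / (ALPHA - 1.0)                  # E[V]; the shift that makes the Pareto increment zero mean

def tail(x):
    """F̄(x) = P{X > x} = (x + C)^(-ALPHA) for the shifted-Pareto example."""
    return (x + C) ** (-ALPHA)

def sample_X(rng, size=None):
    return (1.0 - rng.random(size)) ** (-1.0 / ALPHA) - C

def sample_X_conditional(b, rng):
    """Draw X from F(. | X >= b): for the shifted Pareto, V | V >= b + C is again Pareto."""
    return (b + C) * (1.0 - rng.random()) ** (-1.0 / ALPHA) - C

def Z_dom(n, b, rng):
    """One realization of the estimator (11) for P(A_dom)."""
    X = sample_X(rng, n)
    I = rng.integers(n)
    X[I] = sample_X_conditional(b, rng)     # force the 'big jump' at a uniformly chosen index
    big = np.count_nonzero(X >= b)          # at least 1 by construction
    indicator = (X.sum() > b) and (X.max() >= b)
    return n * tail(b) / big * indicator

rng = np.random.default_rng(1)
samples = [Z_dom(n=100, b=500.0, rng=rng) for _ in range(10000)]
print(np.mean(samples))                     # unbiased estimate of P(A_dom(100, 500))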

Generate N independent realizations of Z_dom and take their sample mean as an estimator of P(A_dom). To evaluate how large N should be chosen so that the computed estimate satisfies the given relative error specification, we need to obtain bounds on the variance of Z_dom. Since #{X_i ≥ b : 1 ≤ i ≤ n} is at least 1 when the increments are drawn following the measure P̃(·), we have Z_dom(n, b) ≤ n F̄(b), and hence

Ẽ[ Z_dom²(n, b) ] ≤ ( n F̄(b) )².

Also Ẽ[Z_dom(n, b)] = P(A_dom(n, b)) ∼ P{S_n > b} ∼ n F̄(b), as n ր ∞. Therefore we get

Var[Z_dom(n, b)] = o( ( n F̄(b) )² ), as n ր ∞.    (12)

3.2. Simulating A_res. We see that all the increments X_1, ..., X_n are bounded from above by b on the occurrence of the event A_res. Though the bound on the increments varies with n, we can employ methods similar to exponential twisting of light-tailed random walks to simulate the event A_res, as illustrated in this section. For given b, define

Λ_b(θ) := log ( ∫_{−∞}^{b} exp(θx) F(dx) ), θ ≥ 0.

Since the upper limit of integration is b, Λ_b(·) is well-defined for any positive value of θ. For given values of n and b, consider the distribution function F_θ(·) satisfying

(dF_θ(x)/dF(x)) = exp( θ_{n,b} x − Λ_b(θ_{n,b}) ) 1(x < b),

for all x ∈ R and some θ_{n,b} > 0. Now the prescribed procedure is to just obtain independent samples of the increments X_1, ..., X_n from F_θ(·) and compute the likelihood ratio due to the procedure of sampling from a different distribution F_θ(·). Let P_θ(·) and E_θ[·] denote, respectively, the corresponding importance sampling change of measure and its associated expectation operator. Note that the dependence of F_θ(·), P_θ(·) and E_θ[·] on n and b has been suppressed in the notation. Then for given values of n and b, we have the following unbiased estimator for the computation of P(A_res):

Z_res(n, b) := exp( −θ_{n,b} S_n + n Λ_b(θ_{n,b}) ) I(A_res).    (13)

Now generate independent replications of Z_res and take their sample mean as the computed estimate for P(A_res). However, it remains to choose θ_{n,b}. Since S_n is larger than b on A_res,

Z_res(n, b) ≤ exp( −θ_{n,b} b + n Λ_b(θ_{n,b}) ) I(A_res).

If we choose

θ_{n,b} := − log( n F̄(b) ) / b,    (14)

then

Z_res(n, b) ≤ n F̄(b) exp( n Λ_b(θ_{n,b}) ) I(A_res).    (15)
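
The choice (14) and the truncated log moment generating function Λ_b(·) can be evaluated numerically. The sketch below does so for the illustrative shifted-Pareto density used earlier; it relies on SciPy's quadrature (an assumed dependency, any numerical integrator would do).

import numpy as np
from scipy.integrate import quad

ALPHA = 2.5
C = ALPHA / (ALPHA - 1.0)

def density(x):
    """Density of the zero-mean shifted-Pareto increment (illustrative choice), support [1 - C, ∞)."""
    return ALPHA * (x + C) ** (-ALPHA - 1.0)

def tail(x):
    return (x + C) ** (-ALPHA)

def theta_nb(n, b):
    """Twisting parameter (14): theta_{n,b} = -log(n * F̄(b)) / b."""
    return -np.log(n * tail(b)) / b

def Lambda_b(theta, b):
    """Truncated log moment generating function: log ∫_{-∞}^b e^(theta x) F(dx),
    computed by numerical quadrature over the example density's support."""
    val, _ = quad(lambda x: np.exp(theta * x) * density(x), 1.0 - C, b, limit=200)
    return np.log(val)

n, b = 100, 500.0
theta = theta_nb(n, b)
print(theta, n * Lambda_b(theta, b))   # n * Lambda_b stays bounded, in line with Lemma 1 below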

We use Lemma 1, which is proved in the appendix, to obtain an upper bound on the second moment of the estimator Z_res.

Lemma 1. For the choice of θ_{n,b} as in (14),

exp( Λ_b(θ_{n,b}) ) ≤ 1 + (1/n)(1 + o(1)),

as n ր ∞, uniformly for b > n^{β+ε}.

Therefore there exists a constant c such that

exp( n Λ_b(θ_{n,b}) ) ≤ c,

for all admissible values of n and b. We evaluate the second moment of the estimator Z_res through the equivalent expectation operation corresponding to the original measure P(·) as below:

E_θ[ Z_res²(n, b) ] = E[ Z_res(n, b) ] ≤ c n F̄(b) P(A_res),

where the last inequality follows from (15). Since P(A_res) = o( n F̄(b) ), as in (6), we obtain that

Var[Z_res(n, b)] = o( (n F̄(b))² ), as n ր ∞,    (16)

thus arriving at the following theorem:

Theorem 1. If the realizations of the estimators Z_dom and Z_res are generated respectively from the measures P̃(·) and P_θ(·), and if we let

Z(n, b) := Z_dom(n, b) + Z_res(n, b),

then under Assumption 1, the family of estimators (Z(n, b) : n ≥ 1, b > n^{β+ε}) achieves asymptotically vanishing relative error for the estimation of P{S_n > b}, as n ր ∞; that is,

Var[Z(n, b)] / (P{S_n > b})² = o(1),

as n ր ∞, uniformly for b > n^{β+ε}.


Proof. Since the realizations of Z_dom and Z_res are generated independently of each other, the variance of Z is just the sum of the variances of Z_dom and Z_res; the proof is now evident from (12), (16) and (5).

Remark 1. A consequence of the above theorem is that, due to (1), the number of i.i.d. replications of Z(n, b) required to achieve ε-relative precision with probability at least 1 − δ is at most o(ε^{−2} δ^{−1}), independent of the rarity parameters n and b. In our algorithm each replication demands O(n) computational effort, thus requiring an overall computational cost of O(n), as n ր ∞.

Remark 2. One can easily check that this same simulation procedure can also be used to efficiently compute the probabilities P{S_N > b} when N is a random variable independent of the X_i.
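
A sketch of how the two independently generated components are combined into the composite estimator of Theorem 1; sample_Z_dom and sample_Z_res are assumed callables (not defined in the paper) producing one unbiased realization each under P̃(·) and P_θ(·) respectively.

import numpy as np

def composite_estimate(sample_Z_dom, sample_Z_res, N, rng):
    """Average N independent replications of Z(n,b) = Z_dom(n,b) + Z_res(n,b).
    Returns the estimate and its relative standard error."""
    zd = np.array([sample_Z_dom(rng) for _ in range(N)])
    zr = np.array([sample_Z_res(rng) for _ in range(N)])
    z = zd + zr
    return z.mean(), z.std(ddof=1) / (z.mean() * np.sqrt(N))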

4. Simulation Methodology for {τ_b < ∞}

As before, the sequence (S_n : n ≥ 0) with S_0 := 0 and S_n := X_1 + ... + X_n represents the random walk associated with the i.i.d. collection (X_n : n ≥ 1). We have E X_n = 0, and P{X_n > x} = x^{−α} L(x) for some slowly varying function L(·) and α > 1. Given µ > 0, M := sup_n (S_n − nµ). Since (S_n − nµ : n ≥ 0) is a random walk with negative drift, the random variable M is proper. For b > 0, recall that the first-passage time τ_b := inf{n ≥ 0 : S_n − nµ > b}. In this section we present simulation methods for the efficient computation of

P{M > b} = P{τ_b < ∞}, as b ր ∞.

Naive simulation of {τ_b < ∞} will require generation of all the increments until the maximum of the partial sums exceeds b. Due to the negative drift of the random walk (S_n − nµ : n ≥ 0), we have τ_b ր ∞ a.s. as b ր ∞, and hence this method is not computationally feasible. To counter the prospect of generating an uncontrollably large number of increment random variables in simulation, we re-express P{τ_b < ∞} as below: Consider a strictly increasing sequence of integers (n_k : k ≥ 0) with n_0 = 0; also fix p := (p_k : k ≥ 1) satisfying p_k > 0 for all k and Σ_k p_k = 1; the vector p can be seen as a probability mass function on the positive integers. If we consider an auxiliary random variable K which takes the value of positive integer k with probability p_k, then we can write

P{τ_b < ∞} = Σ_{k≥1} p_k · ( P{n_{k−1} < τ_b ≤ n_k} / p_k ) = E[ P{n_{K−1} < τ_b ≤ n_K} / p_K ],    (17)

where E[f(K)] = Σ_{k≥1} p_k f(k), for any f : Z_+ → R.

Now in a simulation run, if the realized value of the auxiliary random variable K is k, generate a sample, from a probability measure possibly different from P(·), of a random variable Z_k that has P{n_{k−1} < τ_b ≤ n_k} as its expectation under the changed measure. Then equation (17) assures us that repeated simulation runs involving generation of K according to the measure induced by p, and taking the sample mean of such realizations of Z_K/p_K following the changed measure, will yield an unbiased estimator for the quantity P{τ_b < ∞}.

The performance of any importance sampling algorithm following the outlined procedure will depend crucially on the choice of the probabilities p_k, and the change of measure employed to estimate P{n_{k−1} < τ_b ≤ n_k}, for k ≥ 1. The sequence (n_k : k ≥ 0) partitions the non-negative integers into 'blocks' ((n_{k−1}, n_k] : k ≥ 1). For reasons that will be clear later, we choose the blocks (n_{k−1}, n_k] in the following manner: Fix a positive integer r > 1 and let

n_0 = 0, n_k = r^k, for k ≥ 1.

In the following section, we detail the importance sampling schemes for the efficient computation of the quantities P{n_{k−1} < τ_b ≤ n_k}, k ≥ 1; these will be used as building blocks to efficiently compute the ultimate object of interest P{τ_b < ∞}.

4.1. Efficient simulation of {n_{k−1} < τ_b ≤ n_k}. In this section we present state-independent importance sampling procedures for the computation of the probabilities P{n_{k−1} < τ_b ≤ n_k} that are uniformly efficient in k ≥ 1. Define the following events:

A_k = ∪_{i=n_{k−1}+1}^{n_k} { X_i > b + iµ } and B_k = ∩_{i=1}^{n_k} { X_i < b + n_{k−1}µ }.

The events A_k and B_k are defined in the same spirit as A_dom and A_res in the simulation of {S_n > b} in Section 3: the event A_k includes sample paths that have at least one "big" jump of appropriate size in one of the increments indexed between n_{k−1} and n_k, whereas on the other set B_k we have all the increments bounded from above. The following lemma, proved in the appendix, asserts that asymptotically A_k is the most likely way for the event {n_{k−1} < τ_b ≤ n_k} to happen.

Lemma 2. For any ε > 0, there exists b_ε such that for all b > b_ε,

sup_{k≥1} | P{n_{k−1} < τ_b ≤ n_k, A_k} / P{n_{k−1} < τ_b ≤ n_k} − 1 | < ε.

As in the simulation of large deviation probabilities of sums of random variables in Section 3, we can partition the event {n_{k−1} < τ_b ≤ n_k} into:

{n_{k−1} < τ_b ≤ n_k, A_k}, {n_{k−1} < τ_b ≤ n_k, B_k} and {n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c},

and arrive at unbiased estimators for their probabilities separately via different importance sampling measures.

4.1.1. Simulating {n_{k−1} < τ_b ≤ n_k, A_k}. Let q_k(b) := Σ_{i=n_{k−1}+1}^{n_k} F̄(b + iµ). We prescribe the following two-step procedure:

1. Choose an index J ∈ {n_{k−1} + 1, ..., n_k} such that Pr{J = n} = F̄(b + nµ)/q_k(b), for n_{k−1} < n ≤ n_k.

2. Simulate the increment X_n from F(·|X_n ≥ b + nµ), if n = J; otherwise, simulate X_n from F(·), for any n ≤ n_k.

In this sampling procedure, we induce the 'big' jumps typically responsible for the occurrence of {n_{k−1} < τ_b ≤ n_k} with suitable probabilities by sampling from the conditional distribution F(·|X_J ≥ b + Jµ). This sampling procedure results in the importance sampling measure P_{k,1}(·) characterised by:

dP_{k,1}(x_1, ..., x_{n_k}) := Σ_{i=n_{k−1}+1}^{n_k} ( F̄(b + iµ)/q_k(b) ) · ( dF(x_1) ... dF(x_{n_k}) / F̄(b + iµ) ) 1(x_i ≥ b + iµ).


This in turn yields a likelihood ratio

(dP/dP_{k,1})(X_1, ..., X_{n_k}) = q_k(b) / #{X_i ≥ b + iµ : n_{k−1} < i ≤ n_k}

on the set A_k. Then we have

Z_{k,1}(b) := ( q_k(b) / #{X_i ≥ b + iµ : n_{k−1} < i ≤ n_k} ) I(n_{k−1} < τ_b ≤ n_k, A_k)    (18)

as the unbiased estimator for the quantity P{n_{k−1} < τ_b ≤ n_k, A_k}. Here note that I(n_{k−1} < τ_b ≤ n_k, A_k) = 1 a.s. under P_{k,1}.
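
A sketch of the estimator (18) for one block, again for the illustrative shifted-Pareto increments; the weighted index selection and the conditional big-jump draw follow the two-step procedure above, while the block parameters R and MU and the replication counts are assumptions of the example.

import numpy as np

ALPHA, MU, R = 2.5, 1.0, 2
C = ALPHA / (ALPHA - 1.0)

def tail(x):
    return (x + C) ** (-ALPHA)

def sample_X(rng, size=None):
    return (1.0 - rng.random(size)) ** (-1.0 / ALPHA) - C

def sample_X_above(level, rng):
    """X drawn from F(. | X >= level) for the shifted-Pareto example."""
    return (level + C) * (1.0 - rng.random()) ** (-1.0 / ALPHA) - C

def Z_k1(k, b, rng):
    """One realization of the estimator (18) for P{n_{k-1} < tau_b <= n_k, A_k},
    with blocks n_0 = 0, n_k = R^k."""
    n_lo, n_hi = (0 if k == 1 else R ** (k - 1)), R ** k
    idx = np.arange(n_lo + 1, n_hi + 1)
    weights = tail(b + idx * MU)
    qk = weights.sum()
    J = rng.choice(idx, p=weights / qk)          # index of the forced big jump
    X = sample_X(rng, n_hi)
    X[J - 1] = sample_X_above(b + J * MU, rng)
    S = np.cumsum(X)
    crossed = np.nonzero(S - np.arange(1, n_hi + 1) * MU > b)[0]
    tau = crossed[0] + 1 if crossed.size else np.inf
    A_k = np.any(X[n_lo:] >= b + idx * MU)
    big = np.count_nonzero(X[n_lo:] >= b + idx * MU)   # at least 1 by construction
    return qk / big * ((n_lo < tau <= n_hi) and A_k)

rng = np.random.default_rng(2)
print(np.mean([Z_k1(k=3, b=50.0, rng=rng) for _ in range(5000)]))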

Lemma 3. Uniformly for k ≥ 1,

Var[Z_{k,1}(b)] = o( (P{n_{k−1} < τ_b ≤ n_k})² ), as b ր ∞.

Proof. Since the quantity #{X_i ≥ b + iµ : n_{k−1} < i ≤ n_k} is at least 1 when the increments are generated from P_{k,1}(·),

Z_{k,1}(b) ≤ q_k(b),

and hence

E_{k,1}[ Z_{k,1}² ] ≤ q_k²(b).    (19)

We have

q_k(b) = Σ_{i=n_{k−1}+1}^{n_k} F̄(b + iµ) ≤ Σ_{i=n_{k−1}+1}^{n_k} ∫_{i−1}^{i} F̄(b + uµ) du = ∫_{n_{k−1}}^{n_k} F̄(b + uµ) du.

Changing variables from u to v = b + uµ gives

q_k(b) ≤ (1/µ) ∫_{b+n_{k−1}µ}^{b+n_kµ} F̄(v) dv.

Now given ε > 0, because of (8) and (19),

E_{k,1}[ Z_{k,1}² ] ≤ (1 + ε) (P{n_{k−1} < τ_b ≤ n_k})²

for all k and b large enough. Also,

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ (1 − ε) P{n_{k−1} < τ_b ≤ n_k},

for all k, because of Lemma 2. Therefore,

Var[Z_{k,1}(b)] ≤ ( 1 + ε − (1 − ε)² ) (P{n_{k−1} < τ_b ≤ n_k})² ≤ 3ε (P{n_{k−1} < τ_b ≤ n_k})².


4.1.2. Simulating {n_{k−1} < τ_b ≤ n_k, B_k}. On the event B_k, none of the random variables X_1, ..., X_{n_k} exceed the level (b + n_{k−1}µ); since these increments are bounded (on B_k), we can draw their samples from an appropriately truncated, exponentially twisted variation of F(·) without losing absolute continuity on {n_{k−1} < τ_b ≤ n_k, B_k}. For estimating P{n_{k−1} < τ_b ≤ n_k, B_k}, we draw samples of X_1, ..., X_{τ_b ∧ n_k} independently from the distribution F_k(·) satisfying

(dF_k(x)/dF(x)) = exp( θ_k x − Λ_k(θ_k) ) 1(x < b + n_{k−1}µ), x ∈ R;

here,

θ_k (= θ_k(b)) := − log( n_k F̄(b + n_{k−1}µ) ) / (b + n_{k−1}µ), and    (20)

Λ_k(θ) := log ( ∫_{−∞}^{b+n_{k−1}µ} exp(θx) F(dx) ), θ ≥ 0.    (21)

Let P_{k,2}(·) be the measure induced by drawing samples as above. Then the resulting likelihood ratio on {n_{k−1} < τ_b ≤ n_k, B_k} is:

(dP/dP_{k,2})(X_1, ..., X_{n_k}) = exp( −θ_k S_{τ_b} + τ_b Λ_k(θ_k) ).

The associated estimator for computing P{n_{k−1} < τ_b ≤ n_k, B_k} is:

Z_{k,2}(b) := exp( −θ_k S_{τ_b} + τ_b Λ_k(θ_k) ) I(n_{k−1} < τ_b ≤ n_k, B_k).    (22)

The following uniform bounds, which help in analyzing the variance of the estimator Z_{k,2}, are proved in the appendix.

Lemma 4. For all values of k and b, there exists a positive constant c_1 such that

exp( n_k Λ_k(θ_k) ) ≤ c_1.

Lemma 5. For all values of k and b, there exists a positive constant c_2 such that

n_k F̄(b + n_{k−1}µ) / P{n_{k−1} < τ_b ≤ n_k} ≤ c_2.

Using these results, we now present an asymptotic analysis of the variance of the estimators Z_{k,2}(·).

Lemma 6. Uniformly for k ≥ 1,

Var[Z_{k,2}(b)] = o( (P{n_{k−1} < τ_b ≤ n_k})² ), as b ր ∞.

Proof. Since n_k/r < τ_b ≤ n_k on the event {n_{k−1} < τ_b ≤ n_k},

exp( τ_b Λ_k(θ_k) ) I(n_{k−1} < τ_b ≤ n_k, B_k) ≤ c_1 ∨ c_1^{1/r} =: c,

because of Lemma 4. Further note that θ_k S_{τ_b} ≥ − log( n_k F̄(b + n_{k−1}µ) ) on {n_{k−1} < τ_b ≤ n_k}. Therefore from (22),

Z_{k,2}(b) ≤ c ( n_k F̄(b + n_{k−1}µ) ) I(n_{k−1} < τ_b ≤ n_k, B_k), for all k.


Now, changing the expectation operator in the evaluation of the second moment of the estimator results in the following bound:

E_{k,2}[ Z_{k,2}²(b) ] = E[ Z_{k,2}(b) ] ≤ c ( n_k F̄(b + n_{k−1}µ) ) P{n_{k−1} < τ_b ≤ n_k, B_k}.

Here we apply Lemma 2 for a bound on the probability term in the above expression. Given ε > 0, for all b large enough, we have:

E_{k,2}[ Z_{k,2}²(b) ] ≤ c ( n_k F̄(b + n_{k−1}µ) ) ( ε P{n_{k−1} < τ_b ≤ n_k} ), for all k.

Now using Lemma 5 we obtain

Var[Z_{k,2}(b)] ≤ ε c c_2 (P{n_{k−1} < τ_b ≤ n_k})².

4.1.3. Simulating {n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c}. We draw samples in a two-step procedure similar to that in Section 4.1.1.

1. Choose an index J uniformly at random from {1, ..., n_k}.

2. Simulate the increment X_n from F(·|X_n ≥ b + n_{k−1}µ), if n = J; otherwise, simulate X_n from F(·), for any n ≤ n_k.

If P_{k,3} denotes the change of measure induced by drawing samples according to the above procedure, then the likelihood ratio on the set {n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c} is:

(dP/dP_{k,3})(X_1, ..., X_{n_k}) = n_k F̄(b + n_{k−1}µ) / #{X_i ≥ b + n_{k−1}µ : 1 ≤ i ≤ n_k}.

The resulting estimator for the computation of P{n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c} is:

Z_{k,3}(b) := ( n_k F̄(b + n_{k−1}µ) / #{X_i ≥ b + n_{k−1}µ : 1 ≤ i ≤ n_k} ) I( n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c ).    (23)

Similar to Lemmas 3 and 6, we have the following result on the variance of Z_{k,3}(·):

Lemma 7. Uniformly for k ≥ 1,

Var[Z_{k,3}(b)] = o( (P{n_{k−1} < τ_b ≤ n_k})² ), as b ր ∞.

Proof. When the increments are generated as prescribed in the above two-step procedure, we have #{X_i ≥ b + n_{k−1}µ : 1 ≤ i ≤ n_k} ≥ 1, and hence

Z_{k,3}(b) ≤ n_k F̄(b + n_{k−1}µ) I( n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c ).

Now a bound on the second moment of the estimator can be obtained as before:

E_{k,3}[ Z_{k,3}²(b) ] = E[ Z_{k,3}(b) ] ≤ n_k F̄(b + n_{k−1}µ) P{ n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c }.

Given ε > 0, due to application of Lemma 2, for all k ≥ 1 and b large enough, we have:

E_{k,3}[ Z_{k,3}²(b) ] ≤ n_k F̄(b + n_{k−1}µ) ( ε P{n_{k−1} < τ_b ≤ n_k} ).

Using Lemma 5, we write

Var[Z_{k,3}(b)] ≤ ε c_2 (P{n_{k−1} < τ_b ≤ n_k})²,

thus establishing the claim.


The estimator for P{n_{k−1} < τ_b ≤ n_k} can be obtained by summing the estimators of P{n_{k−1} < τ_b ≤ n_k, A_k}, P{n_{k−1} < τ_b ≤ n_k, B_k}, and P{n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c}:

Z_k(b) := Z_{k,1}(b) + Z_{k,2}(b) + Z_{k,3}(b).

Since the random variables Z_{k,j}(b), j = 1, 2, 3, are generated independently of each other,

Var[Z_k(b)] = Var[Z_{k,1}(b)] + Var[Z_{k,2}(b)] + Var[Z_{k,3}(b)] = o( (P{n_{k−1} < τ_b ≤ n_k})² ),

uniformly for k ≥ 1, as b ր ∞, because of Lemmas 3, 6, and 7. This yields the following theorem:

Theorem 2. The family of estimators {Z_k(b) : k ≥ 1, b > 0} achieves asymptotically vanishing relative error for the unbiased estimation of P{n_{k−1} < τ_b ≤ n_k}, uniformly in k, as b ր ∞; that is,

sup_{k≥1} Var[Z_k(b)] / (P{n_{k−1} < τ_b ≤ n_k})² = o(1),

as b ր ∞.

Remark 3. For our choice of importance sampling measures, the likelihood ratios arising in the simulation of {n_{k−1} < τ_b ≤ n_k, B_k} and {n_{k−1} < τ_b ≤ n_k, A_k^c ∩ B_k^c} are O(n_k F̄(b + n_{k−1}µ)). To have vanishing relative error, we need P{n_{k−1} < τ_b ≤ n_k} to be of the same order, which happens when the choice of (n_k : k ≥ 0) is geometric, as shown in Lemma 5.

4.2. Simulation of {τ_b < ∞} - the finite variance case. Here we develop on the ideas stated at the beginning of Section 4. We have the increasing sequence of integers (n_k : k ≥ 0),

n_0 = 0, n_k = r^k for k ≥ 1,

for some integer r > 1. Further, we have an auxiliary random variable K taking values in the positive integers according to the probability mass function (p_k : k ≥ 1). As in (17), we re-express the quantity of interest as:

P{τ_b < ∞} = E[ P{n_{K−1} < τ_b ≤ n_K} / p_K ].

From Section 4.1, we have estimators {Z_k(b) : k ≥ 1} that can be used to compute the corresponding probabilities {P{n_{k−1} < τ_b ≤ n_k} : k ≥ 1} in an efficient manner. Consider the following simulation procedure:

1. Draw a sample of K such that Pr{K = k} = p_k.

2. Generate a realization of Z_K(b) as in Section 4.1.

3. Return Z_K(b)/p_K.
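
A sketch of this three-step randomization, with the block index truncated to finitely many blocks and renormalized (a practical simplification, not part of the procedure above); sample_Z_block(k, rng) is an assumed callable returning one realization of Z_k(b) under its importance measure.

import numpy as np

def level_crossing_estimate(p, sample_Z_block, N, rng):
    """Average N replications of Z(b) = Z_K(b)/p_K.
    `p` is a truncated probability vector (p_1, ..., p_m) over blocks, renormalized here."""
    p = np.asarray(p, dtype=float)
    ks = rng.choice(np.arange(1, len(p) + 1), size=N, p=p / p.sum())
    vals = np.array([sample_Z_block(k, rng) / p[k - 1] for k in ks])
    return vals.mean()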

We present the sample mean of the values returned by N independent simulation runs of the above procedure as our final estimate of P{τ_b < ∞}. Let Q(·) denote the probability measure on the path space induced by the generation of the increment random variables as a result of one run of this sampling procedure; let E_Q[·] be the expectation operator associated with Q(·). Given b > 0, the overall unbiased estimator for the computation of P{τ_b < ∞} is

Z(b) := Z_K(b)/p_K.


Note that the number of independent simulation runs needed to achieve a desired relative precision, as in (1), is directly related to the sampling variance of Z(b). If (Z(b) : b > 0) offers asymptotically vanishing relative error, we just need o(ε^{−2} δ^{−1}) independent replications of the estimator. However, as pointed out in Hammersley and Handscomb (1965), and further justified in Glynn and Whitt (1992), both the variance of an estimator and the expected computational effort required to generate a single sample are important performance measures, and their product can be considered as a 'figure of merit' in comparing the performance of algorithms that provide unbiased estimators of P{τ_b < ∞}. For any given b, let ν_b denote the largest index of the increment random variables (X_i's) considered for simulation in a particular simulation run. The expectation of ν_b then gives a measure of the expected number of increment random variables generated, and subsequently of the expected computational effort in every simulation run. In particular, the latter may be bounded from above by a constant C > 0 times the expectation of ν_b.

In a single run of the above procedure, if the realized value of K is k, we look to estimate P{n_{k−1} < τ_b ≤ n_k}, which does not entail the generation of more than n_k increment random variables, thus ensuring termination. In particular, n_{K−1} ≤ ν_b ≤ n_K. The following theorems give a measure of both the variance and the expected computational effort per replication of Z(b) for a specific choice of the probabilities p_k:

Theorem 3. For

p_k = ( F_I(b + n_{k−1}µ) − F_I(b + n_kµ) ) / F_I(b), k ≥ 1,    (24)

the family of unbiased estimators (Z(b) : b > 0) achieves asymptotically vanishing relative error for the computation of P{τ_b < ∞}, as b ր ∞; that is:

lim_{b→∞} Var_Q[Z(b)] / P{τ_b < ∞}² = 0.

Theorem 4. If F̄(·) is regularly varying with index α > 2, then for the choice of p = (p_k : k ≥ 1) in (24):

E_Q[ν_b] ≤ ( (r + o(1)) / (µ(α − 2)) ) b, as b ր ∞.

Proofs of both these results are given later in Section 5.
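
For a concrete regularly varying tail, the block probabilities (24) are available in closed form. The sketch below uses the Pareto-type tail F̄(x) = (1 + x)^{−α} (an illustrative assumption, as before) and checks that the p_k telescope to one.

def F_I(x, alpha):
    """Integrated tail for the Pareto-type example F̄(x) = (1 + x)^(-alpha)."""
    return (1.0 + x) ** (1.0 - alpha) / (alpha - 1.0)

def p_k(k, b, alpha, mu, r=2):
    """Block probability (24): p_k = (F_I(b + n_{k-1} mu) - F_I(b + n_k mu)) / F_I(b),
    with n_0 = 0 and n_k = r^k."""
    n_lo = 0 if k == 1 else r ** (k - 1)
    n_hi = r ** k
    return (F_I(b + n_lo * mu, alpha) - F_I(b + n_hi * mu, alpha)) / F_I(b, alpha)

# The p_k sum to 1 over k >= 1, since F_I(b + n_k mu) -> 0 as k grows:
print(sum(p_k(k, b=100.0, alpha=2.5, mu=1.0) for k in range(1, 60)))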

Remark 4. From Theorem 3, we have the vanishing relative error property for computing P{τ_b < ∞} whenever the increment random variables X_n have finite mean (irrespective of the variance). Therefore we require only o(ε^{−2} δ^{−1}) i.i.d. replications of Z(b) to arrive at estimates that differ relatively by at most ε with probability at least 1 − δ. Now from Theorem 4 we conclude that, if the tail index α > 2 (in which case the increments have finite variance), our importance sampling methodology estimates P{τ_b < ∞} in O(b) expected computational effort.

Remark 5. From the conditional limit result in (9), one can infer that the values p_k as in (24) match the zero-variance probabilities P{n_{k−1} < τ_b ≤ n_k | τ_b < ∞} asymptotically. For tails F̄(·) with regularly varying index 1 < α < 2, we have that E[τ_b | τ_b < ∞] = ∞; that is, the zero-variance measure itself has infinite expected termination time! Since the p_k are assigned values similar to P{n_{k−1} < τ_b ≤ n_k | τ_b < ∞}, one might suspect infinite expected termination time for a single run of Algorithm 1 as well. As we note later in Remark 9, after the proof of Theorem 4, for p_k as in (24) this is indeed the case.


4.3. Simulation of {τ_b < ∞} - the infinite variance case. As noted in Remark 5, infinite expected termination time for a simulation algorithm is clearly unacceptable. The following question then is natural: By choosing the p_k differently, even if it means compromising on estimator variance, can one achieve finite expected termination time for the procedure in Section 4.2? Before answering this question below, we introduce a family of tail distributions and their integrated counterparts: for any β > 2, define

G^{(β)}(x) := F̄(x) / x^{β−α}, and G^{(β)}_I(x) := ∫_x^∞ G^{(β)}(u) du.    (25)

Theorem 5. If the tail F̄(·) is regularly varying with index α ∈ (1.5, 2], then for any β ∈ (2, 2α − 1),

p_k = ( G^{(β)}_I(b + n_{k−1}µ) − G^{(β)}_I(b + n_kµ) ) / G^{(β)}_I(b), k ≥ 1    (26)

yields a family of unbiased estimators (Z(b) = Z_K(b)/p_K : b > 0) achieving

1. strong efficiency: lim_{b→∞} Var_Q[Z(b)] / P{τ_b < ∞}² < ∞, and

2. finite expected termination time: E_Q[ν_b] ≤ ( (r + o(1)) / (µ(β − 2)) ) b, as b ր ∞.

Remark 6. Because of the strong efficiency, we need just O(ε^{−2} δ^{−1}) i.i.d. replications of Z(b) to achieve the desired relative precision. As in Remark 4, due to the bound on E_Q[ν_b] in Theorem 5, the average computational effort for the entire estimation procedure is just O(ε^{−2} δ^{−1} b). It is important to see this achievement in the context of Remark 5: the induced measure Q(·) deviates from the zero-variance measure so that we get finite expected termination time, but only at the cost of weakening the vanishing relative error property to strong efficiency. Thus, for the selection of p_k as in (26), the suggested procedure ends up offering a vastly superior performance (in terms of computational complexity) compared to the zero-variance change of measure.
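
Analogously, for a pure Pareto tail F̄(x) = x^{−α}, x ≥ 1 (another illustrative assumption), G^{(β)} and its integrated counterpart are explicit, so the probabilities (26) can be computed directly; β is picked inside (2, 2α − 1) as Theorem 5 requires.

def G_I(x, alpha, beta):
    """Integrated modified tail for a pure Pareto tail F̄(x) = x^(-alpha), x >= 1:
    G^(beta)(x) = F̄(x) / x^(beta - alpha) = x^(-beta), so G^(beta)_I(x) = x^(1-beta)/(beta-1)."""
    return x ** (1.0 - beta) / (beta - 1.0)

def p_k_infinite_variance(k, b, alpha, beta, mu, r=2):
    """Block probability (26); the choice of beta trades a little estimator variance
    for a finite expected termination time when alpha lies in (1.5, 2]."""
    n_lo = 0 if k == 1 else r ** (k - 1)
    n_hi = r ** k
    return (G_I(b + n_lo * mu, alpha, beta) - G_I(b + n_hi * mu, alpha, beta)) / G_I(b, alpha, beta)

# Example: alpha = 1.8 (infinite variance), beta = 2.3 inside (2, 2*1.8 - 1) = (2, 2.6).
print(sum(p_k_infinite_variance(k, b=100.0, alpha=1.8, beta=2.3, mu=1.0) for k in range(1, 60)))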

Given this result, it is difficult not to wonder why the tail index α should be larger than 1.5 in the statement of Theorem 5, and what happens when α ≤ 1.5. The following result shows that it is indeed not possible to have both strong efficiency and finite expected termination time when the tail index α < 1.5.

Theorem 6. If the tail index α < 1.5, there does not exist an assignment of (p_k, n_k : k ≥ 1) such that both E_Q[Z²(b)] and E_Q[ν_b] are simultaneously finite.

Remark 7. If the tail index α = 1.5, the possibility of having both E_Q[Z²(b)] and E_Q[ν_b] finite will depend on the slowly varying function L(·). As we see in the proof of Theorem 6,

E_Q[Z²(b)] E_Q[ν_b] = Ω( ∫_{b²}^∞ √u F̄(u) du ),

as b ր ∞. If L(x) = O((log x)^{−m}), m ≥ 2, the above integral is finite, whereas if L(x) = O(log x) it is infinite; and it is easily verified that the case of L(x) = O((log x)^{−m}), m ≥ 2, goes through the proof of Theorem 5, thus achieving both strong efficiency and finite expected termination time. This illustrates the subtle dependence on the associated slowly varying function L(·) for the existence of such p_k and n_k.

As illustrated by the theorem below, for α ∈ (1, 1.5], we still have algorithms that demand only O(b) units of expected computer time if we settle for less stringent notions of efficiency.


Theorem 7. If the tail F̄(·) is regularly varying with index α ∈ (1, 1.5], then there exists an explicit selection of p = (p_k : k ≥ 1) such that the family of unbiased estimators (Z(b) : b > 0) satisfies both:

lim_{b→∞} E_Q[ Z^{1+γ}(b) ] / P{τ_b < ∞}^{1+γ} < ∞ for all γ ∈ ( 0, (α − 1)/(2 − α) ), and    (27)

E_Q[ν_b] ≤ C b for some constant C.

In particular, for the following selection of p = (p_k : k ≥ 1),

p_k = ( G^{(β)}_I(b + n_{k−1}µ) − G^{(β)}_I(b + n_kµ) ) / G^{(β)}_I(b), k ≥ 1,    (28)

if β is chosen in (2, α + γ^{−1}(α − 1)), both the above inequalities are satisfied.

Remark 8. If the estimator Z(b) satisfies (27), then, similar to how we arrived at (1), it can be shown that O(ε^{−(1+γ^{−1})} δ^{−γ^{−1}}) i.i.d. replications of Z(b) are enough to produce estimates having relative deviation at most ε with probability at least 1 − δ. Now according to Theorem 7, the expected termination time of each replication is O(b). Thus, with the p_k chosen as in (28), we expend just O(ε^{−(1+γ^{−1})} δ^{−γ^{−1}} b) units of computer time on average, which is still linear in b. The price we pay for not adhering to strong efficiency is the worse dependence on the parameters ε and δ.

It is further interesting to note that a vastly different state-dependent methodology developed using Lyapunov inequalities in Blanchet and Liu (2012) also hits identical barriers and provides results similar to ours: They present algorithms that are both strongly efficient and possess O(b) expected termination time for the case of tails having index α > 1.5; whereas when α ∈ (1, 1.5], they provide estimators satisfying (27) along with O(b) expected termination time of a simulation run.

5. Proofs of key theorems

5.1. Proof of Theorem 3. Recall that the overall estimator Z(b) = Z_K(b)/p_K, where p_k is as in (24). The second moment of the estimator Z(b) is bounded as below:

E_Q[Z²(b)] = E_Q[ (Z_K(b)/p_K)² ] = E_Q[ E_Q[ Z_K²(b)/p_K² | K ] ]
           = E_Q[ E_Q[ ( Z_K²(b) / P{n_{K−1} < τ_b ≤ n_K}² ) · ( P{n_{K−1} < τ_b ≤ n_K}² / p_K² ) | K ] ].    (29)

Given ε > 0 and b large enough, (8) and (24) give us:

P{n_{K−1} < τ_b ≤ n_K} / p_K ≤ ( (1 + ε)/µ ) F_I(b).

Also from Theorem 2, we have:

E_Q[ Z_K²(b) / P{n_{K−1} < τ_b ≤ n_K}² | K ] ≤ 1 + ε,

for values of b sufficiently large. Then from (29) and (8),

E_Q[ Z²(b) ] ≤ ( (1 + ε)³ / µ² ) F_I²(b) ≤ (1 + ε)⁴ P{τ_b < ∞}²,

thus proving the asymptotically vanishing relative error property.

5.2. Proof of Theorem 4. Recall that ν_b denotes the maximum of the indices of the increment random variables (X_i's) considered for simulation in a particular simulation run. From the sampling procedures in Section 4.1, it is clear that ν_b ≤ n_K. Therefore,

E_Q[ν_b] ≤ Σ_{k≥1} p_k n_k = r p_1 + Σ_{k≥2} r^k p_k
         = (1/F_I(b)) [ r ∫_b^{b+rµ} F̄(u) du + Σ_{k≥1} r^{k+1} ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du ].    (30)

Since

r^k ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du = ( (b + r^kµ − b)/µ ) ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du ≤ (1/µ) ( ∫_{b+r^kµ}^{b+r^{k+1}µ} u F̄(u) du − b ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du ),

we have

Σ_{k≥1} r^{k+1} ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du ≤ (r/µ) Σ_{k≥1} ( ∫_{b+r^kµ}^{b+r^{k+1}µ} u F̄(u) du − b ∫_{b+r^kµ}^{b+r^{k+1}µ} F̄(u) du )
  = (r/µ) ( ∫_{b+rµ}^∞ u F̄(u) du − b ∫_{b+rµ}^∞ F̄(u) du )    (31)
  ≤ ( (r + o(1))/µ ) ( (b + rµ)²/(α − 2) − b (b + rµ)/(α − 1) ) F̄(b + rµ), as b ր ∞,
  = ( (r + o(1)) / (µ(α − 1)(α − 2)) ) b² F̄(b),

where the penultimate step follows from Karamata's theorem (see (10)), and the final step just uses the long-tailed nature of F̄(·). Also note that ∫_b^{b+rµ} F̄(u) du ≤ rµ F̄(b), and by application of Karamata's theorem we have F_I(b) ∼ b F̄(b)/(α − 1), as b ր ∞. Therefore from (30),

E_Q[ν_b] ≤ ( (r + o(1)) / (µ(α − 2)) ) b, as b ր ∞,

thus yielding the required bound on the expected termination time.

Remark 9. Similar to how we arrived at (31), lower bounds can be obtained to show that E_Q[ν_b] = Ω( ∫_b^∞ u F̄(u) du ). If the tail index α < 2, ∫_b^∞ u F̄(u) du turns out to be infinite, and subsequently E_Q[ν_b] = ∞. Though the assignment of p_k in (24) yields vanishing relative error for any α > 1, it fails to provide algorithms which have finite expected termination time when the increment random variables X have infinite variance (e.g., when α < 2), thus making this choice of p_k not suitable for practice.

5.3. Proof of Theorem 5.

1. Variance of Z(b): Since Q(K = k) = p_k,

E_Q[Z²(b)] = E_Q[ Z_K²(b)/p_K² ] = Σ_k p_k E_Q[Z_k²(b)]/p_k² = Σ_k ( E_Q[Z_k²(b)] / P{n_{k−1} < τ_b ≤ n_k}² ) · ( P{n_{k−1} < τ_b ≤ n_k}² / p_k ).

Thanks to the uniformly efficient estimators developed in Section 4.1, Theorem 2 helps us to write:

E_Q[ Z²(b) ] ≤ (1 + ε)² Σ_k P{n_{k−1} < τ_b ≤ n_k}² / p_k,    (32)

for large values of b. Due to the uniform convergence in (8), and the assignment of p_k as in (26), we can write

P{n_{k−1} < τ_b ≤ n_k} / p_k ≤ (1 + ε) ( ( F_I(b + n_{k−1}µ) − F_I(b + n_kµ) ) / ( G^{(β)}_I(b + n_{k−1}µ) − G^{(β)}_I(b + n_kµ) ) ) G^{(β)}_I(b),

uniformly in k. Note that,

FI(b+ nk−1µ)− FI(b+ nkµ) =

∫ b+nkµ

b+nk−1µF (u)du ≤ (nk − nk−1)µF (b+ nk−1µ),

G(β)I (b+ nk−1µ)− G

(β)I (b+ nkµ) =

∫ b+nkµ

b+nk−1µG(β)(u)du ≥ (nk − nk−1)µG

(β)(b+ nkµ), and

G(β)(b+ nk−1µ)

G(β)(b+ nkµ)≤ G(β)(b+ nk−1µ)

G(β)(r(b+ nk−1µ))≤ (1 + ǫ)rβ,

due to the regularly varying nature of G^{(β)}(·). Therefore, for values of b sufficiently large,

P{n_{k−1} < τ_b ≤ n_k}/p_k ≤ (1 + ǫ) [ F(b + n_{k−1}µ)/G^{(β)}(b + n_{k−1}µ) ] · [ G^{(β)}(b + n_{k−1}µ)/G^{(β)}(b + n_kµ) ] · G^{(β)}_I(b)
                            ≤ (1 + ǫ)^2 r^β (b + n_{k−1}µ)^{β−α} G^{(β)}_I(b),   (33)

for all k, because F(x)/G^{(β)}(x) = x^{β−α}. Then from (32),

E^Q[Z^2(b)] ≤ (1 + ǫ)^4 r^β G^{(β)}_I(b) Σ_k (b + n_{k−1}µ)^{β−α} P{n_{k−1} < τ_b ≤ n_k}
            ≤ (1 + ǫ)^5 r^β G^{(β)}_I(b) Σ_k (b + n_{k−1}µ)^{β−α} ∫_{b+n_{k−1}µ}^{b+n_kµ} F(u)du,


because of (8). Consequently,

E^Q[Z^2(b)] ≤ (1 + ǫ)^5 r^β G^{(β)}_I(b) Σ_k ∫_{b+n_{k−1}µ}^{b+n_kµ} u^{β−α} F(u)du
            ≤ (1 + ǫ)^5 r^β G^{(β)}_I(b) ∫_b^∞ u^{β−α} F(u)du
            ≤ (1 + ǫ)^6 r^β G^{(β)}_I(b) b^{β−α+1} F(b) / (2α − β − 1),

for large enough values of b, due to Karamata's theorem, if 2α − β > 1. This is indeed true because β is assumed smaller than 2α − 1 in the statement of Theorem 5. Further, (α − 1)F_I(b) ∼ bF(b) and b^{β−α}G^{(β)}_I(b) ∼ F_I(b), as b ր ∞. Therefore,

lim_{b→∞} E^Q[Z^2(b)]/F_I^2(b) ≤ (α − 1)r^β/(2α − β − 1) < ∞.

Now since P{τ_b < ∞} ∼ µ^{−1}F_I(b), we have strong efficiency.

2. Expected termination time: Since ν_b ≤ n_K, E^Q[ν_b] ≤ E^Q[n_K] = Σ_k p_k n_k. For the choice of p_k in (26), following exactly the same steps as in the proof of Theorem 4, we arrive at:

E^Q[ν_b] ≤ (r/(µ G^{(β)}_I(b))) ( µ ∫_b^{b+rµ} G^{(β)}(u)du + ∫_{b+rµ}^∞ uG^{(β)}(u)du − b ∫_{b+rµ}^∞ G^{(β)}(u)du ).

Since G^{(β)}(·) is regularly varying with tail index larger than 2, by an application of Karamata's theorem, we have:

∫_{b+rµ}^∞ uG^{(β)}(u)du ∼ ((b + rµ)^2/(β − 2)) G^{(β)}(b + rµ),

which would not have been the case had we persisted with F_I(·) instead of G^{(β)}_I(·) in the definition of p_k. Again following the remaining steps in the proof of Theorem 4, we conclude that:

E^Q[ν_b] ≤ ((r + o(1))/(µ(β − 2))) b, as b ր ∞,

thus yielding a finite expected termination time even when the zero-variance measure fails to offer this desirable property.
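The bound just obtained can be checked numerically in the same spirit as the sketch following Remark 9, now with p_k built from G^{(β)}_I as we read (26); the stand-in choices below, F(x) = (1 + x)^{−α} with α = 1.5 and G^{(β)}(x) = (1 + x)^{−β} with β = 2.5, satisfy F(x)/G^{(β)}(x) = x^{β−α} only up to asymptotic equivalence and are used purely for illustration.

# Sketch: with the G^(beta)_I-based mixture probabilities (our reading of (26)),
# sum_k p_k n_k stays finite even when the increments have alpha < 2, as long
# as beta > 2.  Stand-in: G_beta(x) = (1 + x)^(-beta).

def G_I(x, beta):
    # Integrated tail of G_beta(u) = (1 + u)^(-beta): int_x^infinity G_beta(u) du.
    return (1.0 + x) ** (1.0 - beta) / (beta - 1.0)

def expected_levels_G(beta, b, mu=1.0, r=2, k_max=60):
    total, n_prev = 0.0, 0
    for k in range(1, k_max + 1):
        n_k = r ** k
        p_k = (G_I(b + n_prev * mu, beta) - G_I(b + n_k * mu, beta)) / G_I(b, beta)
        total += p_k * n_k
        n_prev = n_k
    return total

b, beta = 1000.0, 2.5
for k_max in (20, 40, 60):
    print(k_max, expected_levels_G(beta, b, k_max=k_max))

# The partial sums stabilise at a value of the same order as the asymptotic
# bound E^Q[nu_b] <= (r + o(1)) b / (mu (beta - 2)) derived above, whereas the
# F_I-based choice diverges for alpha < 2 (see the sketch after Remark 9).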

5.4. Proof of Theorem 6. Since Q(K = k) = p_k, observe that:

E^Q[Z^2(b)] = E^Q[ Z_K^2(b)/p_K^2 ] = Σ_k E^Q[Z_k^2(b)]/p_k ≥ Σ_k P{n_{k−1} < τ_b ≤ n_k}^2/p_k,

because of Jensen's inequality. To arrive at a contradiction, let us assume that both E^Q[Z^2(b)] and E^Q[ν_b] are finite. Then,

E^Q[Z^2(b)] E^Q[ν_b] ≥ ( Σ_k P{n_{k−1} < τ_b ≤ n_k}^2/p_k ) ( Σ_k p_k n_k )
                     ≥ ( Σ_k ( P{n_{k−1} < τ_b ≤ n_k}/√p_k ) · √(p_k n_k) )^2
                     = ( Σ_k √n_k P{n_{k−1} < τ_b ≤ n_k} )^2,   (34)


where the penultimate step follows from the Cauchy-Schwarz inequality. Due to the uniform convergence in (8), given ǫ > 0, for b large enough:

Σ_k √n_k P{n_{k−1} < τ_b ≤ n_k} ≥ (1 − ǫ) Σ_k √n_k ∫_{n_{k−1}µ}^{n_kµ} F(b + u)du
                                ≥ ((1 − ǫ)/√µ) Σ_k ∫_{n_{k−1}µ}^{n_kµ} √u F(b + u)du
                                = ((1 − ǫ)/√µ) ∫_0^∞ √u F(b + u)du.

It can now be seen that the right-hand side is finite only when α ≥ 1.5, via the following change of variable and subsequent integration of the resulting regularly varying tail:

∫_0^∞ √u F(b + u)du = ∫_b^∞ √(u − b) F(u)du
                    ≥ ∫_{b^2}^∞ √u √(1 − b/u) F(u)du
                    ≥ √(1 − 1/b) ∫_{b^2}^∞ √u F(u)du,

which cannot be finite if α < 1.5, thus arriving at the desired contradiction. Therefore, from (34), we conclude that the second moment of Z(b) and the expected termination time E^Q[ν_b] cannot both be finite when the tail index α < 1.5.

5.5. Proof of Theorem 7. The proof is similar to that of Theorem 5, and we provide only an outline of the steps involved. Since Q(K = k) = p_k, as in (32),

E^Q[Z^{1+γ}(b)] ≤ (1 + ǫ)^{1+γ} Σ_k ( P{n_{k−1} < τ_b ≤ n_k}/p_k )^γ P{n_{k−1} < τ_b ≤ n_k},

for sufficiently large values of b. Then using (33) and (8),

E^Q[Z^{1+γ}(b)] ≤ (1 + ǫ)^{1+3γ} ( G^{(β)}_I(b) )^γ Σ_k (b + n_{k−1}µ)^{γ(β−α)} P{n_{k−1} < τ_b ≤ n_k}
               ≤ (1 + ǫ)^{2+3γ} ( G^{(β)}_I(b) )^γ ∫_b^∞ u^{γ(β−α)} F(u)du,

which follows from the routine calculation in the proof of Theorem 5. Since β is smaller than α + γ^{−1}(α − 1), as in the statement of Theorem 7, the tail index of the integrand satisfies α − γ(β − α) > 1. Therefore we can apply Karamata's theorem to conclude that, for values of b large enough,

E^Q[Z^{1+γ}(b)] ≤ (1 + ǫ)^{3+3γ} ( G^{(β)}_I(b) )^γ b^{γ(β−α)+1} F(b) / (α − γ(β − α) − 1).

Now, observing that (α − 1)F_I(b) ∼ bF(b), b^{β−α}G^{(β)}_I(b) ∼ F_I(b), and P{τ_b < ∞} ∼ µ^{−1}F_I(b) as b ր ∞, we have:

lim_{b→∞} E^Q[Z^{1+γ}(b)] / P{τ_b < ∞}^{1+γ} ≤ µ^2(α − 1)/(α − γ(β − α) − 1) < ∞.

Since β is ensured to be larger than 2, the same proof as before shows that E^Q[ν_b] = O(b).


6. Numerical Experiments

In this section, we present the results of numerical simulation experiments performed on examples previously considered in the literature, and we compare the performance of our algorithms with that of existing methods.

6.1. Example 1 - estimation of P{S_n > b}. Take X = ΛR, where P{Λ > x} = 1 ∧ x^{−4}, R ∼ Laplace(1), and Λ is independent of R. We use N = 10,000 simulation runs to estimate P{S_n > n} for n = 100, 500 and 1000. In Table 1, we compare the numerical estimates obtained by our simulation procedure with the true values of P{S_n > n} evaluated in Blanchet and Liu (2008) via inverse transform techniques; we also compare the performance of our methodology with Algorithms 1 and 2 of Blanchet and Liu (2008) (referred to as BL1 and BL2). From the columns CV, CV of BL1, and CV of BL2, it can be inferred that our state-independent simulation procedure yields estimators with a substantially lower coefficient of variation throughout the range of values considered.

n      P{S_n > n}      Estimate (z)    Std. error      CV of z   CV of BL1   CV of BL2
100    2.21×10^{-5}    2.17×10^{-5}    4.31×10^{-7}    1.97      10.3        4.7
500    1.04×10^{-7}    1.05×10^{-7}    6.91×10^{-10}   0.66      1.0         4.1
1000   1.25×10^{-8}    1.29×10^{-8}    6.91×10^{-11}   0.53      1.1         3.8

Table 1: Numerical results for Example 1. Here Std. error denotes the standard deviation of the estimator of P{S_n > n} based on 10,000 simulation runs; CV denotes the empirically observed coefficient of variation.
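As a side check of the setup of this example (and not of the estimator itself, whose construction is given in the earlier sections), the increment X = ΛR can be sampled by inverse transform, Λ = U^{−1/4} with U uniform on (0, 1], and its tail satisfies P{X > x} ∼ E[(R^+)^4] P{Λ > x} = 12 x^{−4} by Breiman's lemma; the constant 12 and the empirical comparison below are our own computation.

# Sketch (our own sanity check of the Example 1 increment distribution):
# X = Lambda * R with P{Lambda > x} = min(1, x^(-4)) and R ~ Laplace(1).
import numpy as np

rng = np.random.default_rng(0)
N = 2 * 10**6
U = 1.0 - rng.random(size=N)              # uniform on (0, 1], avoids a zero
Lam = U ** (-0.25)                        # inverse transform: P{Lambda > x} = x^(-4) for x >= 1
R = rng.laplace(loc=0.0, scale=1.0, size=N)
X = Lam * R

for x in (10.0, 20.0):
    emp = np.mean(X > x)
    print(x, emp, 12.0 * x ** (-4.0))     # empirical tail vs. Breiman asymptotic

# The two columns agree to within a few percent plus Monte Carlo noise; plain
# Monte Carlo clearly cannot reach the far smaller probabilities in Table 1,
# which is precisely why the importance sampling schemes are compared there.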

6.2. Example 2 - estimation of P{τ_b < ∞}. To facilitate comparison with existing methods, we use the following example from Blanchet and Glynn (2008): consider an M/G/1 queue with traffic intensity ρ = 0.5 and Pareto service times having tail P{V > t} = (1 + t)^{−2.5}. The aim is to estimate the probability that this queue develops a waiting time exceeding b in stationarity, by equivalently estimating the level crossing probability P{τ_b < ∞} of the associated negative-drift random walk. For this example, we use the simulation procedures discussed in Section 4 and compare the results with those of the existing algorithms in the literature in Table 2. While algorithms AK (Asmussen and Kroese (2006)) and DLW (Dupuis et al. (2007)) restrict the arrivals to be Poisson, the schemes BGL, BG and BL, referring to the algorithms in Blanchet et al. (2007), Blanchet and Glynn (2008) and Blanchet and Liu (2012), respectively, do not impose any such restriction.

In our implementation, r has been chosen to be 2 to keep the expected termination time low, as suggested by Theorem 4. The results reported in Table 2 correspond to the simulation estimates of P{τ_b < ∞} for b = 10^2, 10^3 and 10^4 using N = 10,000 simulation runs. From Table 2, it can be inferred that the error of the estimates produced by our simpler state-independent procedure is much smaller than that of the other existing algorithms. Table 3 compares the empirically observed coefficient of variation of the estimators for different values of r and a fixed b = 10^3. It can be seen from Table 3 as well that choosing r = 2 helps in keeping the relative error low.
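As a rough consistency check (ours, not one of the compared algorithms), the asymptotic P{τ_b < ∞} ∼ µ^{−1}F_I(b) quoted in Section 5, combined with the standard fact that the increment tail of the walk is asymptotically the service-time tail, reduces for this example to approximately (1 + b)^{−1.5}; the closed forms for E[V], the arrival rate and the drift below are our own back-of-the-envelope computation, and the resulting approximation is already close to the estimates reported for the proposed method in Table 2.

# Sketch (our own consistency check for Example 2, not a compared algorithm).
# M/G/1 queue: Pareto service tail P{V > t} = (1 + t)^(-2.5), traffic rho = 0.5.
# E[V] = int_0^inf (1 + t)^(-2.5) dt = 2/3, so the Poisson arrival rate is
# lam = rho / E[V] = 0.75 and the walk V - A has drift -(1/lam - E[V]) = -2/3.
# With F_I(b) ~ int_b^inf (1 + u)^(-2.5) du = (1 + b)^(-1.5) / 1.5, the
# approximation P{tau_b < inf} ~ F_I(b) / mu simplifies to (1 + b)^(-1.5).

EV = 2.0 / 3.0
rho = 0.5
lam = rho / EV                  # = 0.75
mu = 1.0 / lam - EV             # = 2/3, magnitude of the negative drift

table2_proposed = {100: 9.75e-4, 1000: 3.15e-5, 10000: 9.98e-7}   # from Table 2
for b, est in table2_proposed.items():
    approx = ((1.0 + b) ** (-1.5) / 1.5) / mu
    print(b, est, approx)

# b = 100, 1000, 10000 give approximately 9.9e-4, 3.2e-5 and 1.0e-6, in line
# with the simulation estimates for the proposed method.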

Method      Quantity      b = 10^2        b = 10^3        b = 10^4
Proposed    Estimate      9.75×10^{-4}    3.15×10^{-5}    9.98×10^{-7}
method      Std. error    4.11×10^{-6}    7.89×10^{-8}    1.39×10^{-9}
            CV            0.42            0.25            0.14
AK          Estimate      1.20×10^{-3}    3.15×10^{-5}    9.98×10^{-7}
            Std. error    1.48×10^{-5}    2.19×10^{-7}    6.95×10^{-9}
            CV            1.23            0.70            0.70
DLW         Estimate      1.05×10^{-3}    3.16×10^{-5}    9.91×10^{-7}
            Std. error    5.20×10^{-6}    1.69×10^{-7}    2.99×10^{-9}
            CV            0.50            0.53            0.30
BGL         Estimate      1.02×10^{-3}    3.17×10^{-5}    1.13×10^{-6}
            Std. error    3.84×10^{-5}    1.60×10^{-6}    7.28×10^{-8}
            CV            3.76            5.05            6.44
BG          Estimate      1.08×10^{-3}    3.15×10^{-5}    9.98×10^{-7}
            Std. error    5.97×10^{-6}    9.73×10^{-8}    2.07×10^{-9}
            CV            0.55            0.31            0.21
BL          Estimate      1.05×10^{-3}    3.18×10^{-5}    9.88×10^{-7}
            Std. error    3.76×10^{-5}    2.60×10^{-7}    8.19×10^{-9}
            CV            3.58            0.82            0.83

Table 2: Numerical results for Example 2. Here Std. error denotes the standard deviation of the estimator of P{τ_b < ∞} based on 10,000 simulation runs; CV denotes the empirically observed coefficient of variation.

r      Estimate        Std. error      CV
2      3.15×10^{-5}    7.89×10^{-8}    0.25
10     3.16×10^{-5}    1.03×10^{-7}    0.33
100    3.16×10^{-5}    1.55×10^{-7}    0.49

Table 3: Comparison of relative errors for different choices of r in Example 2 with b = 1000. Here Std. error denotes the standard deviation of the estimator of P{τ_b < ∞} based on 10,000 simulation runs; CV denotes the empirically observed coefficient of variation.

7. Conclusion

In this paper we revisited the problem of efficient simulation of commonly encountered rare event probabilities associated with random walks having regularly varying heavy-tailed increments.

These comprised the large deviations probability of a random walk exceeding large values as well as the level crossing probability of a negative-drift random walk. In the existing literature there are results that suggest that state-independent methods for such probabilities are difficult to design, and significant research over the last few years has resulted in sophisticated state-dependent importance sampling techniques for estimating these probabilities. Our key contribution has been to challenge this view by showing that simple state-independent importance sampling methods, at least as efficient as the existing state-dependent methods, can indeed be devised to estimate these probabilities.

Our approach relied on partitioning the rare event of interest into elementary events that were amenable to straightforward state-independent importance sampling methods. We expect that this approach will generalize to more complex, multi-dimensional problems, and to similar problems involving Weibull-type subexponential tail distributions.


References

R. J. Adler, R. E. Feldman, and M. S. Taqqu, editors. A practical guide to heavy tails. Birkhäuser Boston Inc., Boston, MA, 1998. ISBN 0-8176-3951-9. Statistical techniques and applications, papers from the workshop held in Santa Barbara, CA, December 1995.

A. Agarwal, S. Juneja, and S. Dey. Efficient simulation of large deviations events for sums of random vectors using saddle point representations. Journal of Applied Probability, 2013.

S. Asmussen and P. W. Glynn. Stochastic simulation: algorithms and analysis, volume 57 of Stochastic Modelling and Applied Probability. Springer, New York, 2007. ISBN 978-0-387-30679-7.

S. Asmussen and C. Klüppelberg. Large deviations results for subexponential tails, with applications to insurance risk. Stochastic Processes and their Applications, 64(1):103–125, 1996. ISSN 0304-4149. doi: 10.1016/S0304-4149(96)00087-7. URL http://www.sciencedirect.com/science/article/pii/S0304414996000877.

S. Asmussen and D. P. Kroese. Improved algorithms for rare event simulation with heavy tails. Adv. in Appl. Probab., 38(2):545–558, 2006. ISSN 0001-8678. doi: 10.1239/aap/1151337084. URL http://dx.doi.org/10.1239/aap/1151337084.

S. Asmussen, K. Binswanger, and B. Højgaard. Rare events simulation for heavy-tailed distributions. Bernoulli, 6(2):303–322, 2000. ISSN 1350-7265. doi: 10.2307/3318578. URL http://dx.doi.org/10.2307/3318578.

A. Bassamboo, S. Juneja, and A. Zeevi. On the inefficiency of state-independent importance sampling in the presence of heavy tails. Oper. Res. Lett., 35(2):251–260, 2007. ISSN 0167-6377. doi: 10.1016/j.orl.2006.02.002. URL http://dx.doi.org/10.1016/j.orl.2006.02.002.

A. Bassamboo, S. Juneja, and A. J. Zeevi. Portfolio credit risk with extremal dependence: Asymptotic analysis and efficient simulation. Operations Research, 56(3):593–606, 2008.

J. Blanchet and P. Glynn. Efficient rare-event simulation for the maximum of heavy-tailed random walks. Ann. Appl. Probab., 18(4):1351–1378, 2008. ISSN 1050-5164. doi: 10.1214/07-AAP485. URL http://dx.doi.org/10.1214/07-AAP485.

J. Blanchet and J. Liu. Efficient simulation and conditional functional limit theorems for ruinous heavy-tailed random walks. Stochastic Processes and their Applications, 122(8):2994–3031, 2012. ISSN 0304-4149. doi: 10.1016/j.spa.2012.05.001. URL http://www.sciencedirect.com/science/article/pii/S0304414912000877.

J. Blanchet, P. Glynn, and J. Liu. Fluid heuristics, Lyapunov bounds and efficient importance sampling for a heavy-tailed G/G/1 queue. Queueing Systems, 57(2–3):99–113, 2007. ISSN 0257-0130. doi: 10.1007/s11134-007-9047-4. URL http://dx.doi.org/10.1007/s11134-007-9047-4.

J. H. Blanchet and J. Liu. State-dependent importance sampling for regularly varying random walks. Adv. in Appl. Probab., 40(4):1104–1128, 2008. ISSN 0001-8678. URL http://projecteuclid.org/getRecord?id=euclid.aap/1231340166.

J. H. Blanchet, K. Leder, and P. W. Glynn. Efficient simulation of light-tailed sums: an old folk song sung to a faster new tune. In Monte Carlo and quasi-Monte Carlo methods 2008, pages 227–248. Springer, Berlin, 2009. doi: 10.1007/978-3-642-04107-5_13. URL http://dx.doi.org/10.1007/978-3-642-04107-5_13.


A. A. Borovkov and K. A. Borovkov. Asymptotic analysis of random walks, volume 118 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2008. ISBN 978-0-521-88117-3. doi: 10.1017/CBO9780511721397. URL http://dx.doi.org/10.1017/CBO9780511721397. Heavy-tailed distributions, translated from the Russian by O. B. Borovkova.

H. Chan, S. Deng, and T. Lai. Rare-event simulation of heavy-tailed random walks by sequential importance sampling and resampling. Preprint, 2012.

A. B. Dieker and M. Mandjes. On asymptotically efficient simulation of large deviation probabilities. Adv. in Appl. Probab., 37(2):539–552, 2005. ISSN 0001-8678. doi: 10.1239/aap/1118858638. URL http://dx.doi.org/10.1239/aap/1118858638.

P. Dupuis and H. Wang. Importance sampling, large deviations, and differential games. Stoch. Stoch. Rep., 76(6):481–508, 2004. ISSN 1045-1129. doi: 10.1080/10451120410001733845. URL http://dx.doi.org/10.1080/10451120410001733845.

P. Dupuis, K. Leder, and H. Wang. Importance sampling for sums of random variables with regularly varying tails. ACM Trans. Model. Comput. Simul., 17(3), July 2007. ISSN 1049-3301. doi: 10.1145/1243991.1243995. URL http://doi.acm.org/10.1145/1243991.1243995.

P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events, volume 33 of Applications of Mathematics (New York). Springer-Verlag, Berlin, 1997. ISBN 3-540-60931-8. For insurance and finance.

S. Foss, D. Korshunov, and S. Zachary. An introduction to heavy-tailed and subexponential distributions. Springer Series in Operations Research and Financial Engineering. Springer, New York, 2011. ISBN 978-1-4419-9472-1. doi: 10.1007/978-1-4419-9473-8. URL http://dx.doi.org/10.1007/978-1-4419-9473-8.

P. Glasserman and J. Li. Importance sampling for portfolio credit risk. Management Science, 51(11):1643–1656, 2005.

P. W. Glynn and W. Whitt. The asymptotic efficiency of simulation estimators. Oper. Res., 40(3):505–520, 1992. ISSN 0030-364X. doi: 10.1287/opre.40.3.505. URL http://dx.doi.org/10.1287/opre.40.3.505.

J. M. Hammersley and D. C. Handscomb. Monte Carlo methods. Methuen & Co. Ltd., London, 1965.

S. Juneja and P. Shahabuddin. Simulating heavy tailed processes using delayed hazard rate twisting. ACM Trans. Model. Comput. Simul., 12(2):94–118, Apr. 2002. ISSN 1049-3301. doi: 10.1145/566392.566394. URL http://doi.acm.org/10.1145/566392.566394.

T. Mikosch and A. V. Nagaev. Large deviations of heavy-tailed sums with applications in insurance. Extremes, 1(1):81–110, 1998. ISSN 1386-1999. doi: 10.1023/A:1009913901219. URL http://dx.doi.org/10.1023/A:1009913901219.

S. Parekh and J. Walrand. A quick simulation method for excessive backlogs in networks of queues. IEEE Trans. Automat. Control, 34(1):54–66, 1989. ISSN 0018-9286. doi: 10.1109/9.8649. URL http://dx.doi.org/10.1109/9.8649.


A. M. K. Rajhaa and S. Juneja. State-independent importance sampling for estimating large deviation probabilities in heavy-tailed random walks. In Performance Evaluation Methodologies and Tools (VALUETOOLS), 2012 6th International Conference on, pages 127–135, Oct. 2012.

S. I. Resnick. Heavy tail modeling and teletraffic data. Ann. Statist., 25(5):1805–1869, 1997. ISSN 0090-5364. doi: 10.1214/aos/1069362376. URL http://dx.doi.org/10.1214/aos/1069362376. With discussion and a rejoinder by the author.

J. S. Sadowsky. On Monte Carlo estimation of large deviations probabilities. Ann. Appl. Probab., 6(2):399–422, 1996. ISSN 1050-5164. doi: 10.1214/aoap/1034968137. URL http://dx.doi.org/10.1214/aoap/1034968137.

J. S. Sadowsky and J. A. Bucklew. On large deviations theory and asymptotically efficient Monte Carlo estimation. IEEE Trans. Inform. Theory, 36(3):579–588, 1990. ISSN 0018-9448. doi: 10.1109/18.54903. URL http://dx.doi.org/10.1109/18.54903.

D. Siegmund. Importance sampling in the Monte Carlo study of sequential tests. Ann. Statist., 4(4):673–684, 1976. ISSN 0090-5364.

Appendix

Here we present proofs of Lemmas 1, 2, 4 and 5. To prove Lemmas 1 and 4, we need Lemmas 8 and 9, which are stated and proved below. The proof of Lemma 8 follows the lines of Borovkov and Borovkov (2008), where bounds for similar integrals have been derived.

Lemma 8. For any pair of sequences {x_n} and {φ_n} satisfying x_n ր ∞ and φ_n x_n ր ∞,

∫_{−∞}^{x_n} e^{φ_n x} F(dx) ≤ 1 + cφ_n^κ + e^{2α} F(2α/φ_n) + e^{φ_n x_n} F(x_n)(1 + o(1)), as n ր ∞,

for any 0 < κ < α ∧ 2 and some constant c which does not depend on n and b.

Proof. We split the region of integration into (−∞, γ/φ_n] and (γ/φ_n, x_n] for some constant γ > 0; the partition is such that the integrand stays bounded on the former region even though the integration extends to −∞. Let I_1 := ∫_{−∞}^{γ/φ_n} e^{φ_n x} F(dx) and I_2 := ∫_{γ/φ_n}^{x_n} e^{φ_n x} F(dx). Since e^{φ_n x} ≤ 1 + φ_n x + φ_n^κ |x|^κ e^{φ_n x},

I_1 ≤ ∫_{−∞}^{γ/φ_n} F(dx) + φ_n ∫_{−∞}^{γ/φ_n} xF(dx) + φ_n^κ ∫_{−∞}^{γ/φ_n} |x|^κ e^{φ_n x} F(dx)
    ≤ ∫_{−∞}^{∞} F(dx) + φ_n ∫_{−∞}^{∞} xF(dx) + φ_n^κ e^γ ∫_{−∞}^{∞} |x|^κ F(dx)
    = 1 + cφ_n^κ,   (35)

where c := e^γ ∫_{−∞}^∞ |x|^κ F(dx) < ∞ because E|X|^κ < ∞; this follows since κ < α and from Assumption 1. We have also used EX = 0 to arrive at (35). Integrating by parts for the second integral I_2:

I_2 = ∫_{γ/φ_n}^{x_n} e^{φ_n x} F(dx)
    = e^γ F(γ/φ_n) − e^{φ_n x_n} F(x_n) + φ_n ∫_{γ/φ_n}^{x_n} e^{φ_n x} F(x) dx
    ≤ e^γ F(γ/φ_n) + I'_2,   (36)

where I'_2 := φ_n ∫_{γ/φ_n}^{x_n} e^{φ_n x} F(x) dx. Now the change of variable u = φ_n(x_n − x) results in:

I'_2 = e^{φ_n x_n} ∫_0^{φ_n x_n − γ} e^{−u} F(x_n − u/φ_n) du
     = e^{φ_n x_n} F(x_n) ∫_0^{φ_n x_n − γ} e^{−u} g_n(u) du,   (37)

where g_n(u) := F(x_n − u/φ_n)/F(x_n) = F( x_n(1 − u/(φ_n x_n)) )/F(x_n).

Since L(·) is slowly varying and φ_n x_n → ∞, given any δ > 0, for all n large enough we have:

(1 − δ)(1 − u/(φ_n x_n))^{−α+δ} ≤ g_n(u) ≤ (1 + δ)(1 − u/(φ_n x_n))^{−α−δ}.

This preliminary fact about slowly varying functions can be found in, e.g., Theorem 1.1.4 of Borovkov and Borovkov (2008). So, for any fixed u, we have g_n(u) → 1 as n ր ∞. Now fix δ = α/2. Then, for n large enough,

g_n(u) ≤ (1 + α/2)(1 − u/(φ_n x_n))^{−3α/2}.   (38)

Let h(u) := (1 − u/(φ_n x_n))^{−3α/2}. Since log h(0) = 0 and (d/du) log h(u) ≤ 3α/(2γ) for 0 ≤ u ≤ φ_n x_n − γ, we have h(u) ≤ e^{3αu/(2γ)} on the same interval. Therefore, if we choose γ = 2α, the integrand in I'_2 is bounded for large enough n by an integrable function, as below:

is bounded for large enough n by an integrable function as below:

∣e−ugn(u)10≤u≤φnxn−γ

∣ ≤∣

∣e−u(

1 +α

2

)

h(u)10≤u≤φnxn−γ

∣ ≤(

1 +α

2

)

e−u+ 3αu

2γ =(

1 +α

2

)

e−u4 .

Applying the dominated convergence theorem, we get

∫_0^{φ_n x_n − γ} e^{−u} g_n(u) du ∼ 1 as n ր ∞.

Since ∫_{−∞}^{x_n} e^{φ_n x} F(dx) = I_1 + I_2, combining this result with (35), (36) and (37) completes the proof.

Lemma 9. Given any ǫ > 0, uniformly for b > n^{β+ǫ}, we have:

(a) nθ_{n,b}^κ ց 0 for some 0 < κ < α, and

(b) F(2α/θ_{n,b}) = O(1/n), as n ր ∞.


Proof. (a) We have F(x) = L(x)x^{−α}. Since L(·) is slowly varying, given any δ > 0, for sufficiently large values of b we have b^{−δ} ≤ L(b) ≤ b^δ, thus yielding L(b) = b^{o(1)} as b ր ∞. Further, noting that b > n^{β+ǫ} helps us to write:

nθ_{n,b}^κ = (n/b^κ) log^κ( 1/(nF(b)) ) ≤ n^{1−κ(β+ǫ)} log^κ( nL(b) ).

If we take

κ := 2 if α > 2, and κ := (1 + ǫ)/(1/α + ǫ) if 1 < α ≤ 2,   (39)

then κ < α and κ(β + ǫ) ≥ 1 + ǫ/2. Then nθ_{n,b}^κ ց 0 as n ր ∞, uniformly for b > n^{β+ǫ}.

(b) We have θ_{n,b} := −log( nF(b) )/b. Therefore,

nF(2α/θ_{n,b}) = nF(b) · F( 2αb/(−log(nF(b))) ) / F(b).

Since F (·) is regularly varying, given any δ > 0, for n large enough,

F( 2αb/(−log(nF(b))) ) / F(b) ≤ ( −log(nF(b)) )^{α+δ}.

The above inequality is just an application of Theorem 1.1.4 of Borovkov and Borovkov (2008). Therefore,

nF(2α/θ_{n,b}) ≤ (nL(b)/b^α) ( −log(nF(b)) )^{α+δ} = o(1), uniformly for b > n^{β+ǫ}, as n ր ∞.

Here the convergence to 0 is justified because α > 2 and b > n^{β+ǫ}.

Proof of Lemma 1. From the definition of Λ_b(·) and Lemma 8, we have:

exp(Λ_b(θ_{n,b})) = ∫_{−∞}^b exp(θ_{n,b}x) F(dx) ≤ 1 + cθ_{n,b}^κ + e^{2α} F(2α/θ_{n,b}) + exp(θ_{n,b}b) F(b)(1 + o(1)),

for κ as in (39). The use of Lemma 8 is justified because θ_{n,b}·b = −log( nF(b) ) ր ∞. The last term equals

exp(θ_{n,b}b) F(b) = (1/(nF(b))) F(b) = 1/n.

From Lemma 9, we have nθ_{n,b}^κ = o(1) and F(2α/θ_{n,b}) = o(1/n), uniformly for b > n^{β+ǫ}. Therefore,

exp(Λ_b(θ_{n,b})) ≤ 1 + (1/n)(1 + o(1)), as n ր ∞.


Proof of Lemma 2. It is enough to show that, given ǫ > 0, there exists b_ǫ such that for all b > b_ǫ,

inf_{k≥1} P{n_{k−1} < τ_b ≤ n_k, A_k} / P{n_{k−1} < τ_b ≤ n_k} > 1 − ǫ.

P{n_{k−1} < τ_b ≤ n_k, A_k} = Σ_{j=n_{k−1}+1}^{n_k} P{τ_b = j, A_k}
                            ≥ Σ_{j=n_{k−1}+1}^{n_k} P{ τ_b = j, S_i > −(A + iδ) for all i < j, X_j > b + A + j(µ + δ) },

for some positive constants A and δ. Let M_n := max_{k≤n}(S_k − kµ) and M := sup_k(S_k − kµ). Then,

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ Σ_{j=n_{k−1}+1}^{n_k} P{ M_{j−1} ≤ b, S_i > −(A + iδ) for all i < j, X_j > b + A + j(µ + δ) }
                            ≥ Σ_{j=n_{k−1}+1}^{n_k} P{ M_{j−1} ≤ b, S_i > −(A + iδ) for all i < j } F(b + A + j(µ + δ))
                            ≥ P{ M ≤ b, S_i > −(A + iδ) for all i } Σ_{j=n_{k−1}+1}^{n_k} F(b + A + j(µ + δ)).

Since P{M > b} = o(1) as b ր ∞, by the union bound we have

P( {M > b} ∪ {S_i < −(A + iδ) for some i} ) ≤ ǫ + P{S_i < −(A + iδ) for some i}.

Due to the law of large numbers, we can find i_ǫ such that, with probability at least 1 − ǫ, S_i > −(A + iδ) for all i > i_ǫ. Further, for the finite collection (S_i : i ≤ i_ǫ), we can choose A large enough so that, with probability at least 1 − ǫ, S_i > −A for all i ≤ i_ǫ. Then,

P( {M > b} ∪ {S_i < −(A + iδ) for some i} ) ≤ 2ǫ,

and hence,

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ (1 − 2ǫ) Σ_{j=n_{k−1}+1}^{n_k} F(b + A + j(µ + δ)).   (40)

Now consider,

Σ_{j=n_{k−1}+1}^{n_k} F(b + A + j(µ + δ)) ≥ Σ_{j=n_{k−1}+1}^{n_k} ∫_j^{j+1} F(b + A + u(µ + δ)) du ≥ ∫_{n_{k−1}+1}^{n_k} F(b + A + u(µ + δ)) du.

After changing the variables of integration, we get:

Σ_{j=n_{k−1}+1}^{n_k} F(b + A + j(µ + δ)) ≥ (1/(µ + δ)) ∫_{b+A+(n_{k−1}+1)(µ+δ)}^{b+A+n_k(µ+δ)} F(u) du
                                          = [ F_I(b + A + (n_{k−1}+1)(µ + δ)) − F_I(b + A + n_k(µ + δ)) ] / (µ + δ).


Then from (40),

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ ((1 − 2ǫ)/(µ + δ)) ( F_I(b + A + (n_{k−1}+1)(µ + δ)) − F_I(b + A + n_k(µ + δ)) ).

Since δ is arbitrary and F_I(·) is long-tailed, for values of b large enough, we have:

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ ((1 − 2ǫ)(1 − ǫ)/µ) ( F_I(b + (n_{k−1}+1)µ) − F_I(b + n_kµ) ).

Now from (8), for all k,

P{n_{k−1} < τ_b ≤ n_k, A_k} ≥ (1 − 2ǫ)(1 − ǫ)^2 P{n_{k−1} < τ_b ≤ n_k},

for large values of b, thus proving the claim.

Proof of Lemma 4. Consider θ : R_+ → R_+. From Lemma 8, we have that, for a given ǫ > 0, if xθ(x) ր ∞, then there exists x_ǫ such that for all x > x_ǫ,

∫_{−∞}^x e^{θ(x)u} F(du) ≤ 1 + cθ^{1+δ}(x) + e^{2α} F(2α/θ(x)) + e^{θ(x)x} F(x)(1 + ǫ),

for some δ > 0. For this, we do not need any condition on the left tail as in Assumption 1. By the definition of θ_k(b) in (20), we have (b + n_{k−1}µ)·θ_k(b) ր ∞ if either b or k grows to infinity. Writing θ_k(b) as θ_k, for values of b and k satisfying b + n_{k−1}µ > x_ǫ, we have

for some δ > 0. For this, we do not need any condition on left tail as in Assumption 1. Bydefinition of θk(b) in (20), we have (b + nk−1µ).θk(b) ր ∞, either if b or k grows to infinity.Expressing θk(b) as θk, for values of b and k satisfying b+ nk−1µ > xǫ, we have,

exp (Λk(θk)) ≤ 1 + cθ1+δk + e2αF

(

θk

)

+ eθk·(b+nk−1µ)F (b+ nk−1µ)(1 + ǫ)

≤ exp

(

cθ1+δk + e2αF

(

θk

)

+1

nk(1 + ǫ)

)

,

because 1 + x ≤ e^x and e^{θ_k(b+n_{k−1}µ)} F(b + n_{k−1}µ) = 1/n_k. Then,

exp(n_kΛ_k(θ_k)) ≤ exp( cn_kθ_k^{1+δ} + e^{2α}n_kF(2α/θ_k) + 1 + ǫ ).   (41)

Also see that,

n_kθ_k^{1+δ} = ( n_k/(b + n_{k−1}µ)^{1+δ} ) ( log( 1/(n_kF(b + n_{k−1}µ)) ) )^{1+δ} < ǫ,   (42)

if b and k are such that (b + n_{k−1}µ) is large enough. Similarly, for a given δ > 0, there exists x_δ such that if b + n_{k−1}µ > x_δ, then

F(2α/θ_k) / F(b + n_{k−1}µ) = F( 2α(b + n_{k−1}µ)/(−log(n_kF(b + n_{k−1}µ))) ) / F(b + n_{k−1}µ)
                            ≤ ( (1/(2α)) log( 1/(n_kF(b + n_{k−1}µ)) ) )^{α+δ}.

Then for values of b and k such that (b + n_{k−1}µ) is large enough,

n_kF(2α/θ_k) ≤ n_kF(b + n_{k−1}µ) ( (1/(2α)) log( 1/(n_kF(b + n_{k−1}µ)) ) )^{α+δ}
             = ( n_kL(b + n_{k−1}µ)/(b + n_{k−1}µ)^α ) ( (1/(2α)) log( 1/(n_kF(b + n_{k−1}µ)) ) )^{α+δ} < ǫ,

because α > 2. Combining this with (41) and (42), for b and k such that b + n_{k−1}µ is sufficiently large,

exp(n_kΛ_k(θ_k)) ≤ exp(1 + 3ǫ),

thus establishing the claim.


Proof of Lemma 5. From the uniform asymptotic (8), given ǫ > 0, for b large enough and for any k ≥ 1, we have:

n_kF(b + n_{k−1}µ) / P{n_{k−1} < τ_b ≤ n_k} = [ n_kF(b + n_{k−1}µ) / ( (1/µ) ∫_{b+n_{k−1}µ}^{b+n_kµ} F(u)du ) ] · [ ( (1/µ) ∫_{b+n_{k−1}µ}^{b+n_kµ} F(u)du ) / P{n_{k−1} < τ_b ≤ n_k} ]
   ≤ (1 + ǫ) n_kF(b + n_{k−1}µ) / ( (1/µ) F(b + n_kµ)(n_kµ − n_{k−1}µ) ).

For k > 1, n_k = rn_{k−1}; then,

n_kF(b + n_{k−1}µ) / P{n_{k−1} < τ_b ≤ n_k} ≤ (1 + ǫ) n_kF(b + (n_k/r)µ) / ( (n_k/r) F(b + n_kµ) )
                                            ≤ (1 + ǫ) F( (1/r)(b + n_kµ) ) / ( (1/r) F(b + n_kµ) )
                                            ≤ (1 + ǫ)^2 r^{α+1},

for values of b large enough and for every k > 1, because of the regularly varying nature of F(·). Also,

n_1F(b + n_0µ) / P{n_0 < τ_b ≤ n_1} ≤ (1 + ǫ) rF(b) / ( rF(b + rµ) ) ≤ (1 + ǫ)^2,

for large values of b, because F(·) is long-tailed. Thus we can find a constant c_2, not depending on k, such that the claim holds.
