
Monte-Carlo Methods for the Estimation of Rare Event Probabilities

    Kevin Leder

    December 2, 2011


    Outline

    1 Introduction

    2 Importance Sampling

    3 Splitting Method

    4 Jackson Network


    Introduction

    Estimation of Small Probabilities via Monte Carlo

Why try to estimate the probability of rare events? Aren't they just 0?

Phrased in terms of probabilities and random variables: there is a random variable Z and a set A such that P(Z \in A) \approx 0.

It is useful to embed the rare event into a sequence of rare events and study asymptotic properties, i.e., consider the events \{Z_n \in A\}.

We are interested in the setting where the probabilities decay exponentially, i.e. there exists a \gamma > 0 such that

\lim_{n \to \infty} \frac{1}{n} \log P(Z_n \in A) = -\gamma < 0.


Estimating Rare Event Probabilities via Standard Monte Carlo

Suppose we are interested in estimating P(Z_n \in A) for some fixed n.

For a large integer k, draw i.i.d. copies (Z_n^1, \ldots, Z_n^k), then form the estimator

\hat p_{n,k} = \frac{1}{k} \sum_{j=1}^k 1_A(Z_n^j),

which is unbiased and consistent.

Consider, though, the relative error of the estimator:

\mathrm{RE}(\hat p_{n,k}) = \frac{\mathrm{sd}(\hat p_{n,k})}{E[\hat p_{n,k}]} = \sqrt{\frac{1 - P(Z_n \in A)}{k\, P(Z_n \in A)}} \approx \frac{1}{\sqrt{k\, P(Z_n \in A)}}.

The number of replications k has to grow like 1/P(Z_n \in A) to keep the relative error bounded.
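A minimal sketch of the standard estimator (my own illustration, not from the slides), for the toy event \{Z_n/n \ge a\} with standard normal increments; all names and parameter values are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def naive_mc(n=50, a=0.5, k=100_000):
    """Standard Monte Carlo estimate of P(Z_n/n >= a) for N(0,1) increments."""
    Zn = rng.standard_normal((k, n)).sum(axis=1)   # k i.i.d. copies of Z_n
    hits = (Zn / n >= a).astype(float)             # 1_A(Z_n^j)
    p_hat = hits.mean()                            # unbiased, consistent
    # sample relative error, sd(p_hat)/E[p_hat] ~ 1/sqrt(k * P(Z_n/n >= a))
    re = np.sqrt((1 - p_hat) / (k * p_hat)) if p_hat > 0 else np.inf
    return p_hat, re

print(naive_mc())   # p is about 2e-4 here, so roughly 20 hits in 100k draws
```

Here P(Z_n/n \ge a) = P(N(0,1) \ge a\sqrt{n}) \approx 2 \times 10^{-4}; pushing a up to 0.8 drops the probability below 10^{-8}, and the same k typically returns zero hits, which is exactly the failure mode described above.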


    Two Solutions

Importance sampling: simulate the system under alternative dynamics so that the event of interest is no longer rare. Keep track of the likelihood ratio so that you can renormalize the final answer to create an unbiased estimator.

Particle based methods: simulate many correlated copies of the system under the original dynamics; these methods can be viewed as a type of branching random walk.

Importance Sampling

For estimating p_n = P(Z_n \in A), first construct a new sampling measure Q, then form the estimator by averaging independent replications of

\hat p_n = \frac{dP}{dQ}(Z_n)\, 1_A(Z_n),

where Z_n is sampled according to the measure Q.

Judge the performance of \hat p_n via its variance (or, equivalently, its 2nd moment):

E_Q[\hat p_n^2] = E[\hat p_n],

where the right-hand expectation is under P.

In order to control the relative error we would like strong efficiency:

\sup_n \frac{E_Q[\hat p_n^2]}{p_n^2} < \infty.


    Importance Sampling for Random Walks

Consider estimating p_n(A) = P(Z_n/n \in A), where Z_n = X_1 + \cdots + X_n and \{X_i\} is an i.i.d. sequence of d-dimensional random vectors that satisfy

\psi(\theta) = \log E[e^{\langle \theta, X_1 \rangle}] < \infty

for \theta in a neighborhood of the origin. Assume that E[X_1] \notin A.

Create a sampling measure by an exponential tilt of each increment X_i: for each \theta \in R^d define a sampling measure

Q_\theta(X_1 \in dx_1, \ldots, X_n \in dx_n) = \frac{e^{\langle \theta, x_1 \rangle}}{e^{\psi(\theta)}} P(X_1 \in dx_1) \cdots \frac{e^{\langle \theta, x_n \rangle}}{e^{\psi(\theta)}} P(X_n \in dx_n).

For this change of measure, denote our estimator of p_n(A) by \hat p_n(A, \theta); the 2nd moment of a single replication of the estimator is

E_{Q_\theta}[\hat p_n(A, \theta)^2] = E\left[1_{\{Z_n/n \in A\}}\, e^{n(\psi(\theta) - \langle \theta, Z_n/n \rangle)}\right].

Study the asymptotic properties of the 2nd moment via large deviations theory.
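A sketch of the tilted estimator in the simplest case (my illustration: standard normal increments, so \psi(\theta) = \theta^2/2, the tilted increment law is N(\theta, 1), and for A = [a, \infty) the tilt \theta = a solves \psi'(\theta) = a):

```python
import numpy as np

rng = np.random.default_rng(1)

def tilted_is(n=50, a=0.5, k=100_000):
    """Exponentially tilted IS estimate of P(Z_n/n >= a), N(0,1) increments."""
    theta, psi = a, a * a / 2.0
    X = rng.normal(loc=theta, scale=1.0, size=(k, n))  # increments under Q_theta
    Zn = X.sum(axis=1)
    # likelihood ratio dP/dQ = exp(-theta * Z_n + n * psi(theta))
    lr = np.exp(-theta * Zn + n * psi)
    est = lr * (Zn / n >= a)                           # one replication each
    p_hat = est.mean()
    re = est.std(ddof=1) / (np.sqrt(k) * p_hat)
    return p_hat, re

print(tilted_is())
```

With the same k as the naive sketch earlier, the relative error here is orders of magnitude smaller, and in this example it stays controlled as a (and hence the rarity of the event) grows.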


    Large Deviations Principle

A sequence of random variables \{Z_n\} taking values in a Polish space \mathcal{X} satisfies a large deviations principle (LDP) with rate function I : \mathcal{X} \to [0, \infty] if

1 I has compact level sets

2 For every Borel set A \subset \mathcal{X},

-\inf_{x \in A^\circ} I(x) \le \liminf_{n \to \infty} \frac{1}{n} \log P(Z_n \in A) \le \limsup_{n \to \infty} \frac{1}{n} \log P(Z_n \in A) \le -\inf_{x \in \bar A} I(x).

A useful alternative formulation: for any bounded and continuous f : \mathcal{X} \to R the following holds:

\lim_{n \to \infty} \frac{1}{n} \log E\left[e^{-n f(Z_n)}\right] = -\inf_{x \in \mathcal{X}} [f(x) + I(x)].


    Large Deviations for Random Walks

Suppose that Z_n = X_1 + \cdots + X_n for an i.i.d. sequence \{X_i\} of d-dimensional vectors that satisfy

\psi(\theta) = \log E[e^{\langle \theta, X_1 \rangle}] < \infty

for \theta in a neighborhood of the origin.

Then Z_n/n satisfies an LDP with rate function (Cramér's theorem)

I(\beta) = \sup_{\theta \in R^d} [\langle \theta, \beta \rangle - \psi(\theta)].

In the 1-d setting, if a > E[X_1] then Cramér's theorem gives that

P(Z_n > na) = e^{-n(I(a) + o(1))},

where \inf_{x \ge a} I(x) = I(a) = a\theta_a - \psi(\theta_a) and \theta_a solves \psi'(\theta_a) = a.
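For concreteness, a standard worked instance (not on the slide): for Gaussian increments X_i \sim N(\mu, \sigma^2),

\psi(\theta) = \mu\theta + \tfrac{1}{2}\sigma^2\theta^2, \qquad \theta_a = \frac{a - \mu}{\sigma^2}, \qquad I(a) = a\theta_a - \psi(\theta_a) = \frac{(a - \mu)^2}{2\sigma^2},

so that P(Z_n > na) = e^{-n\left((a - \mu)^2/(2\sigma^2) + o(1)\right)} for a > \mu.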


    Using Large Deviations for Importance Sampling

Using Cramér's theorem and the alternative formulation of the LDP, we can approximate the 2nd moment of the IS estimator for large n:

\frac{1}{n} \log E\left[1_{\{Z_n/n \in A\}}\, e^{n(\psi(\theta) - \langle \theta, Z_n/n \rangle)}\right] \approx -\inf_{x \in A} \left[I(x) - \psi(\theta) + \langle \theta, x \rangle\right].

The goal of importance sampling is to minimize variance, which gives the following max-min problem:

\sup_{\theta \in R^d} \inf_{x \in A} \left[I(x) - \psi(\theta) + \langle \theta, x \rangle\right].

If A is convex then

\sup_{\theta \in R^d} \inf_{x \in A} \left[I(x) - \psi(\theta) + \langle \theta, x \rangle\right] = \inf_{x \in A} \sup_{\theta \in R^d} \left[I(x) - \psi(\theta) + \langle \theta, x \rangle\right] = 2 \inf_{x \in A} I(x).

Which \theta \in R^d to use? Let x^* = \arg\inf_{x \in A} I(x); then we use the change of measure defined by the tilt \theta_{x^*}, which is the solution of \nabla\psi(\theta) = x^*.

If A is convex, this is a logarithmically efficient estimator.
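In practice \theta_{x^*} rarely has a closed form and is found numerically from \nabla\psi(\theta) = x^*. A small sketch (my example: centered exponential increments X = E - 1 with E \sim \mathrm{Exp}(1), for which \psi(\theta) = -\theta - \log(1 - \theta) on \theta < 1):

```python
import numpy as np
from scipy.optimize import brentq

# psi and psi' for centered exponential increments X = E - 1, E ~ Exp(1)
psi = lambda th: -th - np.log1p(-th)          # valid for th < 1
dpsi = lambda th: -1.0 + 1.0 / (1.0 - th)     # psi'(theta)

def tilt_for(x_star):
    """Solve psi'(theta) = x_star to get the tilt theta_{x*}."""
    return brentq(lambda th: dpsi(th) - x_star, -50.0, 1.0 - 1e-12)

theta_star = tilt_for(0.5)                    # closed form: x*/(1 + x*) = 1/3
I_x = theta_star * 0.5 - psi(theta_star)      # rate I(x*) via Legendre transform
print(theta_star, I_x)
```

The bracketing interval just needs to straddle the root; for these increments \psi'(\theta) = x^* has the closed-form solution \theta = x^*/(1 + x^*), which the solver reproduces.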


    Importance of Convexity

Glasserman and Wang (97) consider the problem of estimating P(|Z_n| > 1.5n), where the increments are X_i = A_i - B_i with A_i \sim N(1.5, 1) and B_i \sim \mathrm{Exp}(1).

If the set were convex then we would find x^* = \arg\inf_{x : |x| > 1.5} I(x) = 1.5, then use the change of measure based on \theta_{1.5}, i.e.,

\frac{dP}{dQ}(x_1) = e^{-\theta_{1.5} x_1 + \psi(\theta_{1.5})}.

However, by pretending the target set is convex we end up with a terrible estimator:

\limsup_{n \to \infty} e^{2n I(1.5)}\, E[\hat p_n(A, \theta_{1.5})^2] = \infty.
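A runnable sketch of this flawed scheme (my construction from the slide's setup; tilting X = A - B by \theta factorizes, so that under Q_\theta we have A \sim N(1.5 + \theta, 1) and B \sim \mathrm{Exp}(1 + \theta)):

```python
import numpy as np

rng = np.random.default_rng(2)

# Increments X = A - B, A ~ N(1.5, 1), B ~ Exp(1):
# psi(theta) = 1.5*theta + theta^2/2 - log(1 + theta), theta > -1
psi = lambda th: 1.5 * th + th**2 / 2 - np.log1p(th)
theta = (np.sqrt(5) - 1) / 2            # solves psi'(theta) = 1.5

def convex_tilt_is(n=60, k=50_000):
    """IS for P(|Z_n| > 1.5 n) using the single tilt theta_{1.5} (flawed)."""
    A = rng.normal(1.5 + theta, 1.0, size=(k, n))
    B = rng.exponential(1.0 / (1.0 + theta), size=(k, n))  # rate 1 + theta
    Zn = (A - B).sum(axis=1)
    lr = np.exp(-theta * Zn + n * psi(theta))   # dP/dQ
    est = lr * (np.abs(Zn) > 1.5 * n)
    return est.mean(), est.std(ddof=1) / (np.sqrt(k) * est.mean())

print(convex_tilt_is())
```

Runs of this sketch typically report a deceptively small empirical relative error: paths ending near -1.5n are almost never sampled under Q_\theta, and it is exactly those unseen rogue paths, which carry enormous likelihood ratios, that make the true second moment blow up.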


    What went wrong?

[Figure: sample paths under the sampling measure: typical paths versus rare rogue paths.]


    A procedure for non-convex A

Dupuis and Wang showed that for non-convex A, logarithmic efficiency requires state-dependent changes of measure.

Suppose that A = A_1 \cup \cdots \cup A_m, where the A_j are closed convex sets. Define y_j = \arg\inf_{y \in A_j} I(y) and \theta_j := \theta_{y_j}. Then a logarithmically efficient change of measure is given by using the transition kernel

Q(X_i \in dx_i \mid Z_{i-1} = z) = \sum_{j=1}^m r_i^j(z)\, e^{\langle \theta_j, x_i \rangle - \psi(\theta_j)}\, P(X_i \in dx_i).

The state-dependent mixture probabilities are described as follows:

r_i^j(z) = \frac{w_i^j(z)}{\sum_{k=1}^m w_i^k(z)},

where

w_i^k(z) = \exp\left[ n\langle \theta_k, z/n - y_k \rangle + (n - i)\psi(\theta_k) \right].
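A sketch of this mixture scheme in one dimension (my illustration: N(0,1) increments, A = (-\infty, -a] \cup [a, \infty), so y_1 = a, y_2 = -a, \theta_j = y_j, \psi(\theta) = \theta^2/2, with the mixture weights as reconstructed in the display above):

```python
import numpy as np

rng = np.random.default_rng(3)

def dw_mixture_is(n=40, a=1.0, k=20_000):
    """State-dependent mixture IS for P(|Z_n|/n >= a), N(0,1) increments."""
    thetas = np.array([a, -a])           # tilts theta_1, theta_2
    ys = np.array([a, -a])               # minimizers y_1, y_2
    psis = thetas**2 / 2                 # psi(theta_j)
    est = np.empty(k)
    for rep in range(k):
        z, log_lr = 0.0, 0.0             # running sum Z_{i-1} and log dP/dQ
        for i in range(1, n + 1):
            logw = thetas * (z - n * ys) + (n - i) * psis
            r = np.exp(logw - logw.max()); r /= r.sum()   # mixture probs r_i^j(z)
            j = rng.choice(2, p=r)                        # pick a component
            x = rng.normal(thetas[j], 1.0)                # tilted increment
            # one-step dQ/dP(x) = sum_j r_j exp(theta_j x - psi_j)
            log_lr -= np.log(np.sum(r * np.exp(thetas * x - psis)))
            z += x
        est[rep] = np.exp(log_lr) * (abs(z) / n >= a)
    p_hat = est.mean()
    return p_hat, est.std(ddof=1) / (np.sqrt(k) * p_hat)

print(dw_mixture_is())
```

Unlike the single-tilt scheme above, the mixture covers both components of A: whichever way a path drifts, one mixture component dominates the one-step density ratio and the likelihood ratio stays controlled.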


    Importance Sampling in Finance

Many option pricing problems can be viewed as rare-event calculations. The option only has value on a small set of the sample space, so its expected value is dominated by values on a rare set.

Glasserman et al. (99) looked at using importance sampling as a computational tool for pricing a variety of path-dependent options. In the setting of a concave payoff function, they present a logarithmically efficient procedure for pricing options.

Guasoni and Robertson extended the framework of the Glasserman paper to a continuous-time setting and establish that the optimal change of measure in continuous time can be found by solving an Euler-Lagrange equation. They assume that the payoff function is concave.

Several works by Glasserman have considered the use of importance sampling for estimating value at risk and conditional value at risk.

Dupuis and Wang show that under very weak conditions on the payoff functional, adaptive importance sampling can be used to evaluate option prices with logarithmic efficiency.

Particle Based Methods

    Splitting Method

Will focus on a specific particle method called the splitting method, first developed in Villen-Altamirano and Villen-Altamirano (94), who called it RESTART.

Dean and Dupuis (08) presented a procedure for the construction of efficient and stable splitting schemes. Will follow their notation.

Model problem: X^n is a sequence of stochastic processes on a domain D \subset R^d, with two disjoint sets A and B; define the sequence of stopping times \tau_n = \min\{i : X^n(i) \in A \cup B\}.

Goal: estimate the probabilities

p_n(x) = P\left(X^n(\tau_n) \in B \mid X^n(0) = x\right).

Assume that there is a non-negative measurable function L such that

-\lim_{n \to \infty} \frac{1}{n} \log p_n(x) = \inf\left\{ \int_0^t L(\phi(s), \dot\phi(s))\, ds : \phi(0) = x,\ \phi(t) \in B,\ \phi(s) \in A^c \text{ for all } s \le t \right\},

the infimum being over absolutely continuous paths \phi and times t > 0.


    The Splitting Algorithm

Consider a collection of nested sets B = C_n(0) \subset C_n(1) \subset \cdots \subset C_n(M_n).

1 Initiate the simulation procedure with a single particle starting from a position x \in C_n(k) for some k \ge 1. Let w_1 = 1 be the initial weight associated to the particle.

2 Evolve the initial particle according to the original transition kernel until it either hits A (dies) or hits level C_n(k-1). If it hits C_n(k-1), it is replaced by r identical particles (r > 1). The weight of each descendant particle is the weight of the parent particle times 1/r.

3 The procedure from step 2 is replicated for each descendant particle, carrying over the value of the weights at each level for the surviving particles.

4 Steps 2 and 3 are repeated until all particles have either died or reached level C_n(0) = B.
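A self-contained sketch of steps 1-4 on a toy instance (my construction, not Dean and Dupuis's implementation): a \pm 1 random walk with downward drift, A = \{0\}, B = \{n\}, one level per integer height, and splitting at first passage of each new height, so a successful lineage splits n - 2 times and the estimator is N_n / r^{n-2}:

```python
import numpy as np

rng = np.random.default_rng(4)

def splitting_run(n=12, x0=1, p_up=1/3, r=2):
    """One splitting run estimating P(walk hits n before 0 | start at x0).

    A particle splits into r copies the first time it exceeds its previous
    best height; copies inherit that height, so each level splits once per
    lineage. Exact answer here (gambler's ruin): 1/4095, about 2.44e-4.
    """
    stack = [(x0, x0)]            # (position, best height credited so far)
    hits = 0
    while stack:
        x, best = stack.pop()
        while True:
            x += 1 if rng.random() < p_up else -1
            if x == 0:            # particle dies in A
                break
            if x == n:            # particle reaches B
                hits += 1
                break
            if x > best:          # first crossing of a new level: split
                best = x
                stack.extend([(x, best)] * (r - 1))
    return hits / r ** (n - 2)    # each split multiplies the weight by 1/r

runs = np.array([splitting_run() for _ in range(400)])
print(runs.mean(), runs.std(ddof=1) / (np.sqrt(len(runs)) * runs.mean()))
```

Here r = 2 roughly matches the reciprocal of the per-level crossing probability (about 1/2 for p_up = 1/3), which keeps the particle population near critical, the stability condition discussed in the analysis slide below.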


Splitting Method

[Figure: nested level sets C_n(2) \supset C_n(1) \supset C_n(0) = B, with a particle trajectory started at x splitting at each level on its way from A toward B.]


    The Splitting Estimator

Consider a collection of nested sets B = C_n(0) \subset C_n(1) \subset \cdots \subset C_n(M_n) (note: we will want M_n = c\,n for some c > 0).

The nested sets are based on level sets of an importance function U. Specifically, define L_z = \{y \in D : U(y) \le z\}; then

C_n(j) = L_{(j-1)/n}.

An important function is the level function

\ell_n(y) = \min\{j \ge 0 : y \in C_n(j)\}.

The estimator for p_n(x) is

R_n(x) = N_n(x) / r^{\ell_n(x)},

where N_n(x) is the number of particles that made it to B.


    Analysis of Splitting Estimators

For numerical stability we want E[N_n(x)] \approx r^{\ell_n(x)} p_n(x) to grow subexponentially, i.e. r^{\ell_n(x)} p_n(x) = \exp(o(n)).

For a logarithmically optimal 2nd moment we require that r^{\ell_n(x)} = p_n(x)^{-1} \exp(o(n)).

Suppose we have a function W(x) such that

p_n(x) = \exp\left(-n W(x) + o(n)\right);

then it suffices to establish that

\ell_n(x) \log r - n W(x) = o(n).

It is easy to see that \ell_n(x) \approx n\,U(x); therefore we choose our importance function as U(x) = W(x)/\log(r).

See Dean and Dupuis for details.


Performance Comparison: Overflow in Jackson Networks


    Open Jackson Networks

Consider a network of d stations. Customers arrive to the network with arrival rate vector \lambda = (\lambda_1, \ldots, \lambda_d)^T, and the service rates of the d stations are encoded by \mu = (\mu_1, \ldots, \mu_d)^T.

A job that leaves station i joins station j with probability P_{i,j} and leaves the system with probability

P_{i,0} = 1 - \sum_{j=1}^d P_{i,j};

P is called the routing matrix.

We are interested in stable open Jackson networks, that is:

i) for every i, either \lambda_i > 0 or \lambda_{j_1} P_{j_1 j_2} \cdots P_{j_k i} > 0 for some j_1, \ldots, j_k;

ii) for every i, either P_{i0} > 0 or P_{i j_1} P_{j_1 j_2} \cdots P_{j_k 0} > 0 for some j_1, \ldots, j_k;

iii) the network is stable (i.e. a stationary distribution exists).


    Basic Properties of Jackson Networks

Assume without loss of generality that \sum_{j=1}^d (\lambda_j + \mu_j) = 1.

Under the stability assumption, the system of traffic equations

\bar\lambda_i = \lambda_i + \sum_{j=1}^d \bar\lambda_j P_{ji}, \qquad i = 1, 2, \ldots, d,

has a unique solution \bar\lambda^T = \lambda^T (I - P)^{-1}.

The traffic intensity at station i in equilibrium is given by \rho_i = \bar\lambda_i / \mu_i \in (0, 1).

Define \rho^* = \max_{1 \le i \le d} \rho_i, and then set \ell = |\{i : \rho_i = \rho^*\}|.

Study the system through the embedded discrete-time Markov chain Q = \{Q(k) : k \ge 0\}, where Q(k) = (Q_1(k), \ldots, Q_d(k)) and Q_i(k) represents the number of customers at station i immediately after the kth transition.
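A quick sketch of these computations (all rates hypothetical, chosen so that \sum_j (\lambda_j + \mu_j) = 1):

```python
import numpy as np

# 3-station example: external arrival rates, service rates, routing matrix
lam = np.array([0.10, 0.05, 0.00])
mu  = np.array([0.30, 0.25, 0.30])
P   = np.array([[0.0, 0.6, 0.2],
                [0.0, 0.0, 0.8],
                [0.1, 0.0, 0.0]])   # row sums < 1; leftover mass exits

# traffic equations: lam_bar = lam + P^T lam_bar  =>  (I - P^T) lam_bar = lam
lam_bar = np.linalg.solve(np.eye(3) - P.T, lam)
rho = lam_bar / mu                  # traffic intensities; stability: rho_i < 1
rho_star = rho.max()                # bottleneck intensity rho*
ell = int((rho == rho_star).sum())  # number of bottleneck stations
print(lam_bar, rho, rho_star, ell)
```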


    Overflow Probabilities in Jackson Networks

Consider a subset of stations encoded by a vector v \in \{0,1\}^d, and denote the total population in this subset by N_v(x) = \langle x, v \rangle.

Will be interested in the following probability:

p_n^V = P\{\text{the total population in the stations encoded by } v \text{ reaches } n \text{ before returning to } 0 \text{, starting from } 0\}.

Can also define p_n^V via the stopping times

T_{\{x\}} := \inf\{k \ge 1 : Q(k) = x\}, \qquad T_n^V := \inf\{k \ge 1 : N_v(Q(k)) \ge n\}.

If we define P_x(\cdot) := P(\cdot \mid Q(0) = x), then

p_n^V = P_0\left(T_n^V < T_{\{0\}}\right),

or more generally

p_n^V(x) = P_x\left(T_n^V < T_{\{0\}}\right).


    Dynamics of Q

The queue length process is just a state-dependent random walk:

Q(k+1) = Q(k) + \pi(Q(k), Y(k+1)),

where \pi is a reflection function that prevents the queue-length process from taking negative values.

The noise term Y(k) represents the outcome of the next transition and has the following pmf:

P(Y(k) = w) = \begin{cases} \lambda_i & w = e_i \quad (\text{arrival at station } i), \\ \mu_i P_{ij} & w = e_j - e_i \quad (\text{departure at station } i \text{ goes to station } j), \\ \mu_i P_{i0} & w = -e_i \quad (\text{departure at station } i \text{ leaves the system}). \end{cases}
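A minimal sketch of one transition of the embedded chain under this description (my construction; lam, mu, P as in the earlier sketch, normalized so the event probabilities sum to 1):

```python
import numpy as np

rng = np.random.default_rng(5)

def step(q, lam, mu, P):
    """One transition: Q(k+1) = Q(k) + pi(Q(k), Y(k+1)).

    Y = e_i w.p. lam_i (arrival), Y = e_j - e_i w.p. mu_i P_ij (routed
    departure), Y = -e_i w.p. mu_i P_i0 (departure leaves the system).
    The reflection pi turns service events at empty stations into null moves.
    """
    q, d, u = q.copy(), len(lam), rng.random()
    for i in range(d):                   # arrival at station i
        u -= lam[i]
        if u < 0:
            q[i] += 1
            return q
    for i in range(d):                   # service completion at station i
        u -= mu[i]
        if u < 0:
            if q[i] > 0:                 # if empty: reflection, null move
                q[i] -= 1
                dest = rng.choice(d + 1, p=np.append(P[i], 1 - P[i].sum()))
                if dest < d:
                    q[dest] += 1         # routed to station dest; else exits
            return q
    return q                             # unreachable when rates sum to 1

# q1 = step(np.zeros(3, dtype=int), lam, mu, P)   # with the earlier lam, mu, P
```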


Logarithmic Asymptotics of Overflow Probabilities in Jackson Networks I

Large deviations theory dictates the existence of a function W^V with

p_n^V(x/n) = \exp\left(-n W^V(x/n) + o(n)\right).

By looking at Q/n we have the following via a formal Taylor expansion:

1 = \frac{1}{p_n^V(x/n)} E\left[p_n^V\left(x/n + \tfrac{1}{n}\pi(x/n, Y(1))\right)\right]
\approx E\exp\left\{-n W^V\left(x/n + \tfrac{1}{n}\pi(x/n, Y(1))\right) + n W^V(x/n)\right\}
= E\exp\left\{-\nabla W^V(x/n)^T \pi(x/n, Y(1)) + o(1)\right\}
= \exp\left(H(x/n, -\nabla W^V(x/n)) + o(1)\right),

where H(x, \theta) = \log E\exp\left(\theta^T \pi(x, Y(k))\right).


Logarithmic Asymptotics of Overflow Probabilities in Jackson Networks II

In order to characterize the logarithmic asymptotics of p_n^V we need to find a function W^V that satisfies

H(x/n, -\nabla W^V(x/n)) = 0,

or, for an asymptotic logarithmic upper bound, a W^V that satisfies

H(x/n, -\nabla W^V(x/n)) \le 0.

A function that satisfies this condition is

W^V(x/n) = \langle \gamma, x/n \rangle - \log \rho_V,

where \gamma_i = \log \rho_i and \rho_V = \max\{\rho_i : v_i = 1\}.

We build our splitting scheme out of this function, i.e. the importance function is given by U(x/n) = W^V(x/n)/\log(r).
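A small sketch of this importance function and the induced level count \ell_n(x) \approx n\,U(x/n) (my illustration, reusing the hypothetical intensities from the earlier sketch; v marks the tracked stations):

```python
import numpy as np

def importance_and_level(x, n, rho, v, r=2):
    """U(x/n) = W^V(x/n)/log r and the level function l_n(x) ~ n U(x/n)."""
    gamma = np.log(rho)                   # gamma_i = log rho_i
    rho_V = rho[v.astype(bool)].max()     # bottleneck intensity among v
    U = (gamma @ (x / n) - np.log(rho_V)) / np.log(r)
    return U, n * U

rho = np.array([0.372, 0.468, 0.386])     # intensities from the earlier sketch
v = np.array([1, 1, 0])                   # track stations 1 and 2
print(importance_and_level(x=np.zeros(3), n=40, rho=rho, v=v))
```

Starting from an empty network, roughly n\,U(0) = n(-\log \rho_V)/\log r levels separate the start from the overflow set, consistent with M_n = c\,n on the earlier splitting slide.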


Logarithmically Efficient Estimation of Overflow Probabilities

Dean and Dupuis established that if we use the importance function U, then the splitting estimator for p_n^V(x) is logarithmically efficient, and the number of particles created grows subexponentially in n.

Similarly, Dupuis and Wang (09) established that using subsolutions to the PDE from the previous slide, one can construct logarithmically efficient IS estimators for overflow probabilities in Jackson networks.

How do we then evaluate the relative merits of the two algorithms? This requires refined knowledge of performance characteristics, not just at the logarithmic scale.


Asymptotics of Overflow Probabilities in Jackson Networks

The stationary distribution of a Jackson network is product-form:

\pi(m_1, \ldots, m_d) = \prod_{j=1}^d P(Q_j(\infty) = m_j) = \prod_{j=1}^d (1 - \rho_j)\rho_j^{m_j}, \qquad m_j \ge 0,\ j = 1, \ldots, d.

One can use this result and a time-reversal argument to show that if x is in a compact set then there exist k_0 and k_1 such that

\limsup_{n \to \infty} \frac{p_n^V(x)}{e^{n\gamma_V} n^{\ell_V - 1}} \le k_1 \qquad \text{and} \qquad \liminf_{n \to \infty} \frac{p_n^V(x)}{e^{n\gamma_V} n^{\ell_V - 1}} \ge k_0,

where \gamma_V = \log \rho_V, in which \rho_V = \max\{\rho_i : v_i = 1\}, and \ell_V = \sum_i I\{\rho_i = \rho_V,\ v_i = 1\}. See Blanchet (11) or Blanchet, Leder, Shi (11).


Computational Effort for a Single Run of the Splitting Algorithm

In Blanchet, Leder, and Shi (11) we looked at the computational effort necessary to use a well designed splitting algorithm.

Define C = -\log \rho_V / \log r; then rewrite the importance function and level function as

U(x/n) = C\left(1 - \frac{\langle \gamma, x \rangle}{n \log \rho_V}\right), \qquad \ell_n(x) = C\left(n - \frac{\langle \gamma, x \rangle}{\log \rho_V}\right).

Consider the total number of particles that make it to the overflow set; one can see that

E[N_n(x)] = r^{\ell_n(x)} p_n^V(x) \le c\, e^{n\gamma_V} n^{\ell_V - 1} r^{\ell_n(x)}.

Notice that e^{\gamma_V} = e^{\log \rho_V} = e^{-C \log r} = r^{-C}, so that

E[N_n(x)] \le c\, n^{\ell_V - 1} r^{\ell_n(x) - Cn},

and if we assume that x/n \to 0 then we have that E[N_n(x)] \le c\, n^{\ell_V - 1}.


    Refined Performance of Splitting

From the previous slide we saw that the number of particles to survive is of order n^{\ell_V - 1}; the actual computational effort is on the order of n^{\ell_V + 1}, as established in Blanchet, Leder and Shi (11).

The computational effort required to achieve a fixed level of relative error is given by

C_n\, \frac{E[R_n(x)^2]}{p_n^V(x)^2},

where C_n is the computational cost per replication of the estimator, i.e. roughly n^{\ell_V + 1}.

In Blanchet, Leder, Shi (11) we establish that

E[R_n(x)^2] = p_n^V(x)^2\, O\left(n^{\ell_V}\right).

Thus the computational cost of a well designed splitting algorithm is O\left(n^{2\ell_V + 1}\right).


    Importance Sampling for Tandem Jackson Network

Dupuis, Sezer, and Wang considered estimating total population overflow in a d-node tandem network using a sampling measure defined by

\frac{Q(Y(k) = z \mid Q(k-1) = x)}{P(Y(k) = z \mid Q(k-1) = x)} = \sum_{j=0}^d r_j(x/n) \exp\left(\langle \theta^j, z \rangle - H(x/n, \theta^j)\right),

where

r_j(x/n) = \frac{w_j(x/n)}{\sum_{k=0}^d w_k(x/n)}, \qquad w_j(x/n) = \exp\left(n\langle \theta^j, x/n \rangle + n\gamma + jn\delta\right),

with \delta > 0 a mollification parameter of the scheme, and

(\theta^j)_i = \begin{cases} \gamma & 1 \le i \le d - j, \\ 0 & \text{otherwise}, \end{cases}

where \gamma = \log \rho is the tilt associated with the bottleneck intensity \rho.

Dupuis, Sezer and Wang established that this estimator is logarithmically efficient.

Call the associated estimator \hat p_n.


    Refined Performance of Importance Sampling

In Blanchet, Glynn and Leder (11) we performed a refined analysis of this estimator to compare it with splitting and other methods.

We know that the cost of the algorithm is roughly

\frac{E[\hat p_n^2]}{e^{2n\gamma} n^{2\ell - 2}},

since p_n^2 is of order e^{2n\gamma} n^{2\ell - 2}, where \ell is the number of bottleneck stations.

By direct analysis of the likelihood ratio on the event of interest, we are able to establish that

E[\hat p_n^2] = O\left(e^{2n\gamma} n^{2d}\right).

Thus the computational complexity of this algorithm is O\left(n^{2(d - \ell + 1)}\right).


Comparing Performance on Estimating Overflow Probabilities in Tandem Networks

The computational cost of the splitting algorithm is O(n^{2\ell + 1}).

The computational cost of the importance sampling algorithm is O(n^{2(d - \ell + 1)}), where \ell is the number of bottleneck stations.

Thus we prefer importance sampling if more than half the stations are bottlenecks, and splitting otherwise.

Conjecture: this property holds for all Jackson networks, not just the tandem network topology.