
Financial risk estimation using nested MLMC and portfolio sub-sampling

Mike Giles¹   Abdul-Lateef Haji-Ali²

¹Oxford University
²Heriot-Watt University

ICIAM, 19 July 2019

Outline

Risk applications and nested expectation.

Problem setting and a Monte Carlo estimator.

Estimating a sum.

Multilevel Monte Carlo (MLMC) for nested expectations.

MLMC + uniform inner sampling.

MLMC + adaptive inner sampling.

Concluding remarks.


Risk analysis

Stochastic models are increasingly being adopted in real-life applications.

An important question in such applications is assessing the risk of some extreme event:
- in finance: risk of loss, default or ruin,
- in industrial modelling: risk of component failure,
- in crowd modelling: risk of stampede,
- ...

Risk assessment is the first step to risk management.

Computing risk measures is computationally difficult because
- extreme events are extremely rare,
- the risk measures are not smooth (either the event happened or it did not),
- and the underlying stochastic models are difficult to evaluate (or expensive to approximate).

In this work, we address the last two points.


Nested expectation in risk applications

The "losses" are modelled by P random variables $\{X_i\}_{i=1}^{P}$.

$\{X_i\}_{i=1}^{P}$ depend on another (multi-dimensional) random variable Y, the risk factor.

The expected loss for a given risk factor is
$$\Lambda = \mathbb{E}\left[ \frac{1}{P} \sum_{i=1}^{P} X_i \,\middle|\, Y \right].$$

We are interested in computing the probability of the expected loss exceeding $\Lambda_\eta$:
$$\eta = \mathbb{P}[\,\Lambda > \Lambda_\eta\,] = \mathbb{E}\left[ H\!\left( \mathbb{E}\left[ \frac{1}{P} \sum_{i=1}^{P} X_i - \Lambda_\eta \,\middle|\, Y \right] \right) \right]$$
where $H(\cdot)$ is the Heaviside step function.

Key message: the probability of a large expected loss involves a nested expectation.


Nested estimation using Monte Carlo

$$\mathbb{E}\left[ H\!\left( \mathbb{E}\left[ \frac{1}{P} \sum_{i=1}^{P} X_i \,\middle|\, Y \right] \right) \right]$$

We use N inner samples of $\{X_i\}_{i=1}^{P}$ to estimate $H(\mathbb{E}[X \,|\, Y]) \approx H\big(\overline{X}_N(Y)\big)$, where
$$\overline{X}_N(Y) = \frac{1}{NP} \sum_{n=1}^{N} \sum_{i=1}^{P} X_i^{(n)}(Y).$$

This leads to a bias of $\mathcal{O}(P^{-1}N^{-1})$. Using Monte Carlo for the outer expectation as well,
$$\mathbb{E}\big[ H(\mathbb{E}[X \,|\, Y]) \big] \approx \frac{1}{M} \sum_{m=1}^{M} H\!\big( \overline{X}_N(Y^{(m)}) \big)$$
leads to a sampling error of $\mathcal{O}(M^{-1/2})$.


Nested estimation using Monte Carlo

To achieve a root-mean-square error $\varepsilon$, choose
$$N = \max\!\big(1, \mathcal{O}(P^{-1}\varepsilon^{-1})\big), \qquad M = \mathcal{O}(\varepsilon^{-2}).$$

The cost of the nested Monte Carlo estimator is $MNP$, hence the complexity is $\mathcal{O}\big(\max(P\varepsilon^{-2}, \varepsilon^{-3})\big)$.

Ideally we would like the complexity to be $\mathcal{O}(\varepsilon^{-2})$, independently of P. Hence we will
- devise a strategy to sample the sum with a complexity that is independent of P, so that the overall complexity is $\mathcal{O}(\varepsilon^{-3})$;
- use MLMC to reduce the complexity to almost $\mathcal{O}(\varepsilon^{-2})$.
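Before either improvement, the plain nested estimator is simple to state in code. Below is a minimal sketch; the function names and the Gaussian toy model at the end are assumptions for illustration, not from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def nested_mc_probability(sample_Y, sample_X_given_Y, Lambda_eta, M, N):
    # Outer Monte Carlo over M risk-factor samples Y^(m); for each, the
    # inner mean over N x P loss samples estimates Lambda = E[(1/P) sum X_i | Y].
    hits = 0
    for _ in range(M):
        y = sample_Y()
        X = sample_X_given_Y(y, N)       # array of shape (N, P)
        hits += X.mean() > Lambda_eta    # Heaviside H(bar X_N - Lambda_eta)
    return hits / M

# Toy check (an assumption): X_i | Y ~ N(Y, 1) with Y ~ N(0, 1), so Lambda = Y
# and eta = P[Y > Lambda_eta] is a standard normal tail probability.
P = 100
eta_hat = nested_mc_probability(
    sample_Y=lambda: rng.standard_normal(),
    sample_X_given_Y=lambda y, n: y + rng.standard_normal((n, P)),
    Lambda_eta=1.0, M=10_000, N=16)
```

In the toy model the closed-form tail probability gives a direct check on the bias and sampling error discussed above.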

Estimating a sum

Recall that we have to compute $\frac{1}{P}\sum_{i=1}^{P} X_i$ for every sample of the risk factor, Y. Here, we focus on a single computation for a single risk scenario.

Using a random sub-sampler we can approximate
$$\frac{1}{P} \sum_{i=1}^{P} X_i = \frac{1}{P}\, \mathbb{E}\big[ X_j\, p_j^{-1} \big] \approx \frac{1}{PN} \sum_{n=1}^{N} X^{(n)}_{j(n)}\, p_{j(n)}^{-1}$$
where $j$ is a random integer with $\mathbb{P}[\,j = i\,] = p_i$ for $i \in \{1, \dots, P\}$.

The cost of this random sub-sampler is N, while the MSE is bounded by
$$N^{-1} P^{-2} \sum_{i=1}^{P} \mathbb{E}[X_i^2]\, p_i^{-1}.$$

Estimating a sum

Minimizing the MSE leads to the optimal expression for the probabilities
$$p_i = \tilde{g}_i \Big/ \sum_{k=1}^{P} \tilde{g}_k$$
for $\tilde{g}_i^2 \approx \mathbb{E}[X_i^2]$, and the optimal MSE
$$N^{-1} P^{-2} \left( \sum_{i=1}^{P} \frac{\mathbb{E}[X_i^2]}{\tilde{g}_i} \right) \left( \sum_{i=1}^{P} \tilde{g}_i \right) \approx N^{-1} P^{-2} \left( \sum_{i=1}^{P} \tilde{g}_i \right)^{\!2} = \mathcal{O}(N^{-1}),$$
which is bounded uniformly in P.
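A minimal sketch of this sub-sampler with the optimal probabilities (hypothetical names; `sample_Xi` stands in for whatever generates one sample of $X_i$):

```python
import numpy as np

rng = np.random.default_rng(1)

def subsampled_mean(sample_Xi, g_tilde, N):
    # Draws N indices j(n) with P[j = i] = p_i = g_tilde_i / sum_k g_tilde_k,
    # then averages X_j / p_j; unbiased since E[X_j / p_j] = sum_i X_i.
    P = len(g_tilde)
    p = g_tilde / g_tilde.sum()
    j = rng.choice(P, size=N, p=p)
    vals = np.array([sample_Xi(i) for i in j])
    return (vals / p[j]).mean() / P
```

The cost is N evaluations regardless of P; the division by $p_j$ is what makes the estimator unbiased, and the choice of $\tilde g$ above is what keeps the MSE bound $\mathcal{O}(N^{-1})$ uniformly in P.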

In nested expectation

Hence, we write
$$\mathbb{E}\left[ H\!\left( \mathbb{E}\left[ \frac{1}{P} \sum_{i=1}^{P} X_i \,\middle|\, Y \right] \right) \right] = \mathbb{E}\big[ H(\mathbb{E}[X \,|\, Y]) \big]$$
where
$$X = P^{-1} X_j\, p_j^{-1} \qquad\text{and}\qquad p_j = \tilde{g}_j \Big/ \sum_{k=1}^{P} \tilde{g}_k$$
for some sequence $\tilde{g}_k$ independent of Y, e.g., $\tilde{g}_k^2 = \mathbb{E}[X_k^2]$, so that the optimal probabilities have to be computed only once.

Using the random sub-sampler, the computational complexity is independent of the number of terms P. Moreover, in some cases it can be reduced by a constant by using a mixed sub-sampler.

Mixed sub-sampler

To illustrate the need for mixed sub-sampling, consider the simple example
$$\frac{1}{P} \sum_{i=1}^{P} X_i$$
where all the $X_i$ terms are deterministic. A mixed estimator, for $0 \leq Q \leq N$, is
$$\frac{1}{P} \sum_{i=1}^{P} X_i = \frac{1}{P} \sum_{i=1}^{Q} X_i + \frac{1}{P}\, \mathbb{E}\big[ X_j\, p_j^{-1} \big] \approx \frac{1}{P} \sum_{i=1}^{Q} X_i + \frac{1}{P(N-Q)} \sum_{n=1}^{N-Q} X_{j(n)}\, p_{j(n)}^{-1}$$
where $j$ is a random integer with $\mathbb{P}[\,j = i\,] = p_i$ for $i \in \{Q+1, \dots, P\}$. When Q = 0 we have a fully random sub-sampler, and when Q = P we are computing the sum exactly.

Mixed sub-sampler

Using the previous choice for $p_i$, the MSE is bounded by
$$(N-Q)^{-1} P^{-2} \left( \sum_{i=Q+1}^{P} X_i \right)^{\!2}$$
since the error contribution is due only to the random sub-sampler. Hence, by evaluating the largest Q terms deterministically and optimizing with respect to Q, using a knapsack-type optimization, we can end up with a smaller MSE.
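A minimal sketch of the mixed estimator plus a brute-force scan for Q that minimizes the MSE bound above. All names are hypothetical, and taking $\tilde g_i$ as a noisy proxy for $X_i$ is my assumption so that the random part is non-degenerate:

```python
import numpy as np

rng = np.random.default_rng(2)

def mixed_subsampled_mean(X, g_tilde, N, Q):
    # The Q terms with the largest g_tilde are summed exactly; the remaining
    # P - Q terms are importance-sampled with N - Q draws, p_i ~ g_tilde_i.
    order = np.argsort(g_tilde)[::-1]
    det, rest = order[:Q], order[Q:]
    total = X[det].sum()
    if Q < len(X):
        p = g_tilde[rest] / g_tilde[rest].sum()
        j = rng.choice(len(rest), size=N - Q, p=p)
        total += (X[rest][j] / p[j]).mean()   # unbiased for the remaining sum
    return total / len(X)

# Toy data matching the slides' illustration: X_i^2 i.i.d. Exponential(rate 3).
X = np.sqrt(rng.exponential(scale=1/3, size=1000))
g = X * np.exp(0.1 * rng.standard_normal(X.size))   # rough proxy g_tilde ~ X_i
# Knapsack-lite: scan Q to minimize the bound (N - Q)^{-1} (sum_{i>Q} g_(i))^2.
N = 200
gs = np.sort(g)[::-1]
tail = gs[::-1].cumsum()[::-1]                      # tail[Q] = sum over non-fixed terms
Q_best = int(np.argmin(tail[:N] ** 2 / (N - np.arange(N))))
estimate = mixed_subsampled_mean(X, g, N, Q_best)
```

The scan over Q is a simple stand-in for the knapsack-type optimization mentioned on the slide.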

Numerical illustration

For P = 1000 and $X_i^2$ being i.i.d. samples from an exponential distribution with rate 3:

[Figure: histogram of $\{X_i\}_{i=1}^{P}$.]

[Figure: Q/P, the portion of terms which are sub-sampled deterministically, plotted against N/P.]

[Figure: RMSE(Q)/RMSE(0), the ratio of the RMSEs of the mixed and fully random sub-samplers, plotted against N/P.]

MLMC for nested expectation

Next, we want to apply MLMC to the nested expectation to reduce the overall complexity from $\mathcal{O}(\varepsilon^{-3})$ to $\mathcal{O}(\varepsilon^{-2})$. Building a hierarchy of L + 1 estimators with $N_\ell$ inner samples for $\ell = 0, 1, \dots, L$, the MLMC estimator is
$$\mathbb{E}[\mathrm{P}] \approx \sum_{\ell=0}^{L} \frac{1}{M_\ell} \sum_{m=1}^{M_\ell} \Delta_\ell \mathrm{P}^{(\ell,m)},$$
where
$$\mathrm{P} = H(\mathbb{E}[X \,|\, Y]), \qquad \mathrm{P}_\ell = H\big( \overline{X}_{N_\ell}(Y) \big),$$
$$\Delta_\ell \mathrm{P}^{(\ell,m)} = \mathrm{P}_\ell^{(\ell,m)} - \mathrm{P}_{\ell-1}^{(\ell,m)} = H\big( \overline{X}_{N_\ell}(Y^{(\ell,m)}) \big) - H\big( \overline{X}_{N_{\ell-1}}(Y^{(\ell,m)}) \big),$$
and $\mathrm{P}_{-1} = 0$.
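A minimal sketch of this estimator. The names are hypothetical; reusing the first $N_{\ell-1} = N_\ell/2$ inner samples for the coarse term is an assumed (but standard) coupling, not spelled out on the slide, and X is taken pre-shifted by $\Lambda_\eta$ so that H is a sign test at 0:

```python
import numpy as np

def mlmc_nested(sample_Y, sample_X_given_Y, L, M_levels, N0=1):
    # MLMC for E[H(E[X|Y])] with N_l = N0 * 2^l inner samples on level l.
    est = 0.0
    for l in range(L + 1):
        Nl, Ml = N0 * 2**l, M_levels[l]
        acc = 0.0
        for _ in range(Ml):
            y = sample_Y()
            X = sample_X_given_Y(y, Nl)            # N_l inner samples, 1-D array
            fine = float(X.mean() > 0.0)           # P_l
            # Coarse term reuses the first N_{l-1} of the same inner samples,
            # so both Heaviside terms share the same Y and correlated means.
            coarse = 0.0 if l == 0 else float(X[: Nl // 2].mean() > 0.0)
            acc += fine - coarse                   # Delta_l P
        est += acc / Ml
    return est
```

The shared Y and shared inner samples are what make $\mathrm{Var}[\Delta_\ell \mathrm{P}]$ decay with level, which the complexity theorem on the next slide exploits.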

Multilevel Monte Carlo: summary

For a sequence of approximations $\mathrm{P}_\ell \approx \mathrm{P}$ and $\Delta_\ell \mathrm{P} = \mathrm{P}_\ell - \mathrm{P}_{\ell-1}$ with $\mathrm{P}_{-1} = 0$, we have
$$\mathbb{E}[\mathrm{P}] = \sum_{\ell=0}^{\infty} \mathbb{E}[\Delta_\ell \mathrm{P}] \approx \sum_{\ell=0}^{L} \mathbb{E}[\Delta_\ell \mathrm{P}] \approx \sum_{\ell=0}^{L} \frac{1}{M_\ell} \sum_{m=1}^{M_\ell} \Delta_\ell \mathrm{P}^{(\ell,m)}$$
where $\Delta_\ell \mathrm{P}^{(\ell,m)}$ is the $(\ell,m)$'th sample of $\Delta_\ell \mathrm{P}$. Assuming
$$\big| \mathbb{E}[\mathrm{P} - \mathrm{P}_\ell] \big| = \mathcal{O}(2^{-\alpha\ell}), \qquad V_\ell = \mathrm{Var}[\Delta_\ell \mathrm{P}] = \mathcal{O}(2^{-\beta\ell}), \qquad W_\ell = \mathcal{O}(2^{\gamma\ell}),$$
where $W_\ell$ is the work to sample $\Delta_\ell \mathrm{P}$, then there are optimal choices of L and $M_\ell$ so that the MLMC estimator has complexity
$$\mathcal{O}\big( \varepsilon^{-2-\max(0,\, (\gamma-\beta)/\alpha)} \big) \text{ when } \gamma \neq \beta, \qquad \mathcal{O}\big( \varepsilon^{-2} |\log \varepsilon|^2 \big) \text{ otherwise.}$$

c.f. MC: $\mathcal{O}(\varepsilon^{-2-\gamma/\alpha})$.
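For orientation in this application (these rates come from the Giles and Haji-Ali paper in the references, not from this slide, so treat them as background): with the sub-sampler the cost per inner sample is $\mathcal{O}(1)$, and uniform inner sampling $N_\ell = 2^\ell$ gives $\gamma = 1$, $\alpha = 1$ and, under a bounded-density assumption on the inner conditional mean, $\beta = 1/2$, so
$$\text{uniform: } \mathcal{O}\big(\varepsilon^{-2-(\gamma-\beta)/\alpha}\big) = \mathcal{O}\big(\varepsilon^{-5/2}\big); \qquad \text{adaptive (below): } \gamma = \beta = 1 \;\Rightarrow\; \mathcal{O}\big(\varepsilon^{-2}|\log\varepsilon|^2\big).$$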

Multilevel Monte Carlo: summaryFor P ≈ P` and ∆`P = P` − P`−1 with P−1 = 0, we have

E[P ] =∞∑`=0

E[ ∆`P ] ≈L∑`=0

E[ ∆`P ] ≈L∑`=0

1

M`

M∑̀m=1

∆`P(`,m)

where ∆`P(`,m) is the (`,m)’th samples of ∆`P. Assuming

|E[P − P` ]| = O(

2−α`),

V` = Var[ ∆`P ] = O(

2−β`),

W` = O(2γ`),

where the work to sample ∆`P is W`, then there are optimal choices of Land M` so that the MLMC estimator has complexityO

(ε−2−max(0,γ−β

α )), when γ 6= β

O(ε−2|log ε|2

)otherwise.

c.f. MC: O(ε−2− γα )

Giles (Oxford) MLMC for Risk Estimation 19 July, 2019 13 / 21

Multilevel Monte Carlo: summaryFor P ≈ P` and ∆`P = P` − P`−1 with P−1 = 0, we have

E[P ] =∞∑`=0

E[ ∆`P ] ≈L∑`=0

E[ ∆`P ] ≈L∑`=0

1

M`

M∑̀m=1

∆`P(`,m)

where ∆`P(`,m) is the (`,m)’th samples of ∆`P. Assuming

|E[P − P` ]| = O(

2−α`),

V` = Var[ ∆`P ] = O(

2−β`),

W` = O(2γ`),

where the work to sample ∆`P is W`, then there are optimal choices of Land M` so that the MLMC estimator has complexityO

(ε−2−max(0,γ−β

α )), when γ 6= β

O(ε−2|log ε|2

)otherwise.

c.f. MC: O(ε−2− γα )

Giles (Oxford) MLMC for Risk Estimation 19 July, 2019 13 / 21

Multilevel Monte Carlo: summaryFor P ≈ P` and ∆`P = P` − P`−1 with P−1 = 0, we have

E[P ] =∞∑`=0

E[ ∆`P ] ≈L∑`=0

E[ ∆`P ] ≈L∑`=0

1

M`

M∑̀m=1

∆`P(`,m)

where ∆`P(`,m) is the (`,m)’th samples of ∆`P. Assuming

|E[P − P` ]| = O(

2−α`),

V` = Var[ ∆`P ] = O(

2−β`),

W` = O(2γ`),

where the work to sample ∆`P is W`, then there are optimal choices of Land M` so that the MLMC estimator has complexityO

(ε−2−max(0,γ−β

α )), when γ 6= β

O(ε−2|log ε|2

)otherwise.

c.f. MC: O(ε−2− γα )

Giles (Oxford) MLMC for Risk Estimation 19 July, 2019 13 / 21

Choice of $N_\ell$: need for adaptivity

$$H(\mathbb{E}[X \,|\, Y]) \approx H\big( \overline{X}_{N_\ell}(Y) \big)$$

[Figure: the pdf of $\overline{X}_{N_\ell}$, of width $\sigma N_\ell^{-1/2}$ with $\sigma^2 = \mathrm{Var}[X \,|\, Y]$, plotted against the Heaviside H for two outer samples, one with $\mathbb{E}[X \,|\, Y_1]$ far from the discontinuity and one with $\mathbb{E}[X \,|\, Y_2]$ close to it.]

The point: when $\mathbb{E}[X \,|\, Y]$ is close to the discontinuity of H, the $\mathcal{O}(\sigma N_\ell^{-1/2})$ noise in $\overline{X}_{N_\ell}$ can flip the sign, so such outer samples need many more inner samples than those far from it.


MLMC + adaptive inner sampling

Let
$$d = |\mathbb{E}[X \,|\, Y]|, \qquad \sigma^2 = \mathrm{Var}[X \,|\, Y], \qquad \delta = d/\sigma.$$

We will instead use the following number of inner samples:
$$N_\ell = \max\!\Big( 2^\ell,\; 4^\ell \min\!\big(1, (2^\ell \delta)^{-r}\big) \Big), \qquad 1 < r < 2.$$

Note that $2^\ell \leq N_\ell \leq 4^\ell$.
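This rule transcribes directly into code; rounding up to an integer is my addition, since the slide leaves $N_\ell$ real-valued:

```python
import numpy as np

def adaptive_N(level, delta, r=1.5):
    # N_l = max(2^l, 4^l * min(1, (2^l * delta)^(-r))); delta = 0 caps at 4^l.
    cap = 1.0 if delta <= 0 else min(1.0, (2.0**level * delta) ** (-r))
    return int(np.ceil(max(2.0**level, 4.0**level * cap)))
```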

MLMC + adaptive inner sampling

$$N_\ell = \max\!\Big( 2^\ell,\; 4^\ell \min\!\big(1, (2^\ell \delta)^{-r}\big) \Big), \qquad 1 < r < 2.$$

[Figure, shown over several frames: $N_\ell$ plotted against $\delta = d/\sigma$, with breakpoints at $\delta = 2^{-\ell}$ (below which $N_\ell = 4^\ell$) and $\delta = 2^{-\ell(r-1)/r}$ (above which $N_\ell = 2^\ell$), for $(\ell, r) = (2, 1.99), (3, 1.99), (4, 1.99), (4, 1.4), (4, 1.2), (4, 1.01)$.]

Numerical analysis

Problem: In practice $\delta = d/\sigma$ is unknown, so the real adaptive algorithm has to use Monte Carlo estimates $\hat{d}$ and $\hat{\sigma}$ to compute $N_\ell$.

Algorithm: For a given outer sample Y, starting with the minimum $N_\ell = 2^\ell$, keep doubling the number of inner samples $N_\ell$ until it is large enough based on the current estimate $\hat{\delta} = \hat{d}/\hat{\sigma}$, i.e.,
$$N_\ell \geq 4^\ell\, (2^\ell \hat{\delta})^{-r},$$
or it reaches the maximum, $4^\ell$.

Concerns:
- If we use too many samples, the cost may be larger than we want.
- If we use too few samples, the variance may be larger than we want.

The main idea of the analysis is to prove that the probability of ending up with the "wrong" number of inner samples decays very rapidly as you move away from the "right" number, which we would get if we used the exact $\delta$.
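A minimal sketch of that doubling loop (hypothetical names; each inner draw is assumed to already be the sub-sampled, shifted portfolio quantity X, returned as a 1-D array):

```python
import numpy as np

def adaptive_inner_estimate(sample_X_given_Y, y, level, r=1.5):
    # Start at the minimum N_l = 2^l and keep doubling until
    # N_l >= 4^l * (2^l * delta_hat)^(-r) or N_l hits the maximum 4^l,
    # with delta_hat = |mean| / std re-estimated from the samples so far.
    N, N_max = 2**level, 4**level
    X = sample_X_given_Y(y, N)
    while N < N_max:
        delta_hat = abs(X.mean()) / (X.std(ddof=1) + 1e-12)
        if delta_hat > 0 and N >= N_max * (2**level * delta_hat) ** (-r):
            break
        X = np.concatenate([X, sample_X_given_Y(y, N)])   # double N_l
        N *= 2
    return X.mean(), N
```

Samples drawn early are kept when N is doubled, so the final mean uses all of them; outer samples with large $\hat{\delta}$ stop at $2^\ell$, while those near the discontinuity run up to $4^\ell$.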


MLMC + adaptive inner sampling

Theorem (main result on output of adaptive algorithm)

Provided
1. the random variable $\delta = d/\sigma$ has a bounded density near 0,
2. there exists q > 2 such that
$$\sup_y \left\{ \mathbb{E}\left[ \left( \frac{|X - \mathbb{E}[X \,|\, Y]|}{\sigma} \right)^{\!q} \,\middle|\, Y = y \right] \right\} < \infty,$$
3. and $1 < r < 2 - \dfrac{\sqrt{4q+1} - 1}{q}$,

then using the adaptive algorithm with this r to compute $N_\ell$ we have
$$\mathbb{E}[N_\ell] = \mathcal{O}(2^\ell) \qquad\text{and}\qquad V_\ell := \mathrm{Var}[\Delta_\ell \mathrm{P}] = \mathcal{O}(2^{-\ell}).$$
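A quick sanity check of hypothesis 3 (arithmetic worked out from the stated bound, not taken from the slide): for q = 4 the admissible range is $1 < r < 2 - (\sqrt{17}-1)/4 \approx 1.22$, while as $q \to \infty$ the bound $2 - (\sqrt{4q+1}-1)/q \to 2$, recovering the full range $1 < r < 2$; in particular, values like r = 1.99 on the earlier frames require very high moments q.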

Other risk measures: Value-at-Risk and Conditional VaR

The Value-at-Risk (VaR), $\Lambda_\eta$, is defined implicitly by $\mathbb{P}[\,\Lambda > \Lambda_\eta\,] = \eta$.

This can be estimated by a stochastic root-finding algorithm, with the acceptable error $\varepsilon$ being steadily reduced during the iteration. Complexity is $\mathcal{O}(\varepsilon^{-2}|\log \varepsilon|^2)$.

Given a VaR estimate, $\widetilde{\Lambda}_\eta$, the Conditional VaR (CVaR) is then
$$\mathbb{E}[\,\Lambda \mid \Lambda > \Lambda_\eta\,] = \min_x \big\{ x + \eta^{-1}\, \mathbb{E}[\max(0, \Lambda - x)] \big\}$$
$$= \widetilde{\Lambda}_\eta + \eta^{-1}\, \mathbb{E}[\max(0, \Lambda - \widetilde{\Lambda}_\eta)] + \mathcal{O}\big( (\widetilde{\Lambda}_\eta - \Lambda_\eta)^2 \big)$$
$$= \widetilde{\Lambda}_\eta + \eta^{-1}\, \mathbb{E}[\max(0, \mathbb{E}[X \,|\, Y])] + \mathcal{O}\big( (\widetilde{\Lambda}_\eta - \Lambda_\eta)^2 \big),$$
where in the last line the shift by $\widetilde{\Lambda}_\eta$ is absorbed into X, as on the earlier nested-expectation slide. Complexity is $\mathcal{O}(\varepsilon^{-2})$.
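As a crude stand-in for the stochastic root-finding step (a simplification, not the algorithm on the slide), one can take the empirical quantile of outer samples of $\Lambda$ and plug it into the minimization formula above:

```python
import numpy as np

def var_cvar_from_samples(Lambda_hat, eta):
    # VaR estimate: empirical (1 - eta)-quantile of the sampled expected losses.
    var = np.quantile(Lambda_hat, 1.0 - eta)
    # CVaR via the second line of the slide's identity (Rockafellar-Uryasev form).
    cvar = var + np.maximum(0.0, Lambda_hat - var).mean() / eta
    return var, cvar
```

Here `Lambda_hat` would hold inner estimates of $\Lambda$ for each outer sample; in the full method those come from the MLMC machinery above rather than plain averages.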


Epilogue: key messages

Risk estimation (and nested expectations) is a great new application area for MLMC.

Keys to performance:
- MLMC approach with more inner samples on "finer" levels,
- adaptive number of inner samples,
- sub-sampling to obtain a cost that is independent of the number of options.

Using an antithetic estimator is possible and improves the computational complexity by a constant.

The discussion can easily be extended to terms with heterogeneous work.

More complicated underlying assets, requiring time discretization, are also handled using unbiased MLMC (leading to nested MLMC).


References

- M.B. Giles and A. Haji-Ali. "Multilevel nested simulation for efficient risk estimation." SIAM/ASA Journal on Uncertainty Quantification, 7(2):497–525, 2019.
- M.B. Giles. "Multilevel Monte Carlo path simulation." Operations Research, 56(3):607–617, 2008.
- M.B. Gordy and S. Juneja. "Nested simulation in portfolio risk measurement." Management Science, 56(10):1833–1848, 2010.
- M. Broadie, Y. Du and C.C. Moallemi. "Efficient risk estimation via nested sequential simulation." Management Science, 57(6):1172–1194, 2011.
- K. Bujok, B.M. Hambly and C. Reisinger. "Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives." Methodology and Computing in Applied Probability, 17:579–604, 2015.
- M.B. Giles. "Multilevel Monte Carlo methods." Acta Numerica, 24:259–328, 2015.
- W. Gou. "Estimating Value-at-Risk using multilevel Monte Carlo Maximum Entropy method." Oxford University MSc thesis, 2016.