automatic numeric inversion - boun.edu.tr
TRANSCRIPT
Automatic Numeric InversionWolfgang Hormann, Gerhard Derflinger, Josef Leydold
IE-Department Bogazici University IstanbulDepartment of Statistics, WU Wien
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.0/41
Non-Uniform Random Variate Generation
Aim:Generate a sequence Xi of IID random variates with givendistribution.
Solution:Transform sequence Ui of IID U(0, 1) random numbers intosequence Xi.
U1, U2, U3, U4, . . . −→ X1, X2, X3, . . .
For inversion this transformation is one-to-one.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.1/41
Inversion Method
Theorem:Let F(x) be a CDF of the given distribution.If U is a U(0, 1) random number, then
X = F−1(U) = inf {x : F(x) ≥ U}
is a random variate with CDF F.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.2/41
Inversion Method
Required: (Inverse of) CDF F of Distribution.
1
U
X
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.3/41
Inversion Method
Required: (Inverse of) CDF F of Distribution.
1
U
X
U ∼ U(0, 1)
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.3/41
Inversion Method
Required: (Inverse of) CDF F of Distribution.
1
U
X
U ∼ U(0, 1) −→ X = F−1(U)
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.3/41
Inversion Method // Example
Exponential distribution:
F(x) = 1 − exp(−x)
F−1(u) = −log(1 − u)
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.4/41
Inversion Method // Algorithm
Required: Inverse of CDF
Generate U ∼ U(0, 1).Compute X = F−1(U).Return X.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.5/41
Inversion Method // Algorithm
Required: Inverse of CDF
Generate U ∼ U(0, 1).Compute X = F−1(U).Return X.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.5/41
Inversion Method // Algorithm
Required: Inverse of CDF
Generate U ∼ U(0, 1).Compute X = F−1(U).Return X.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.5/41
Inversion Method // Algorithm
Required: Inverse of CDF
Generate U ∼ U(0, 1).Compute X = F−1(U). ⇐ Problem (?)Return X.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.5/41
Inversion Method // Advantages
The most general method for generating non-uniformrandom variates.Works for all distributions provided that the CDF isgiven.
Get one random variate X for each uniform U.
Preserves the structural properties of the underlyinguniform PRNG.Easy to use for QMC.Required for Copula generation.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.6/41
Inversion Method // Advantages
Consequently . . .
Can be used for variance reduction techniques.(common / antithetic variates, stratified sampling, . . . )
Sampling from truncated distributions.
Quality of generated random numbers depends only onthe underlying uniform PRNG, not on distribution.
Hence preferred method for MC.
Often considered as the only possible alternative for QMC.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.7/41
Inversion Method // Disadvantages
CDF and its inverse often not given in closed form(or unknown).
Algorithms based on well-known numerical methods(e.g. Newton’s method or regula falsi) are slow and canonly be speeded up by the usage of (large) tables.
Numerical methods are not exact, i.e., they producerandom numbers which are only approximatelydistributed as the given distribution.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.8/41
Automatic Methods
Automatic algorithms have been developed fornon-standard distributions.
Also called black-box or universal methods.
Idea:One algorithm works for a large class of distributions.
Today these algorithms have properties that makes themalso attractive for generating from standard distributions.Even for sampling from Gaussian distributions.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.9/41
Automatic Inversion Method // Why ?
Reasons for development:
In many simulation situations the application ofstandard distributions is not adequate.Development of inversion methods for specialdistribution too “expensive”.
New Automatic Inversion Algorithm:
User provides the PDF and one point in the domain.Set-up calculates all necessary tables and thus"designs" the algorithm.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.10/41
Automatic Inversion Method // Why ?
Reasons for development:
In many simulation situations the application ofstandard distributions is not adequate.Development of inversion methods for specialdistribution too “expensive”.
New Automatic Inversion Algorithm:
User provides the PDF and one point in the domain.Set-up calculates all necessary tables and thus"designs" the algorithm.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.10/41
Basics of Numerical Inversion
Devroye’s Notion of “Exact R.V.G. Algorithms”:
Inversion only possible for F−1(u) available in closed form.
Practically all MC or QMC experiments use 64-bit floatingpoint numbers.
We call an algorithm “exact” if its precision is close tomachine precision.
Problem: Definition of the approximation error
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.11/41
Definition of Approximation Error
Without inverting the CDF we can only control the “u-error”of the approximate inverse CDF denoted by F−1
a (x).
εu(u) = |u − F(F−1a (u))|
The “x-error”, i.e., εx(u) = |F−1(u) − F−1a (u)| can be quite
large in the tails as
εx(u) = εu(u)/f(x) + O(εu(u)2)
u-error useful for bound of “F-discrepancy”.
Small u-error enough for MC and QMC applications.
For exact quantiles in the far tails the x-error is necessary!
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.12/41
Requirements for the new Algorithm
Maximal accepted u-error can be selected by the userMaximal u-errors close to machine precision can bereached with medium-sized tablesSampling algorithm (very) fast
To reach the above we accept:
Slow set-upMedium-sized tables
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.13/41
Main Ingredients of Numerical Inversion
Numeric Integration to obtain the CDF F(x):Gauss-Lobatto integrationNumerical approximation of F−1(u):Newton interpolationDecomposition into many subintervals:indexed search to find the subintervalImportant Synergies:We use the same subintervals for integration andinterpolation.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.14/41
Numerical integration
As Quadrature rule we use: Gauss-Lobatto(5)
(similar to Gauss-Legendre(4) but uses the intervalendpoints as nodes)
Error bound for Lobatto(5) for interval of length ∆:
integrationerror ≤ 7.03 · 10−10 ∆9 maxξ
f(8)(ξ)
Thus: Numeric integration is no problem forsmooth PDF f(x) with bounded 8th derivative.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.15/41
CDF
Start with a point x0 in the far left tail and selectsubinterval borders xi.Calculate
∫xi+1
xif(x)dx with Gauss-Lobatto(5).
Store the CDF values of xi.
To evaluate the CDF at x:
Find the maximal i with xi ≤ x
Use Gauss-Lobatto(5) to evaluate∫x
xif(t)dt..
Return F(x) = F(xi) +∫x
xif(t)dt.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.16/41
Lobatto(5) Integration error on (0, x)
0.0 0.1 0.2 0.3 0.4 0.5
−8e−
12−6
e−12
−4e−
12−2
e−12
0e+0
0
Lobato 5 Integration error for PDF gamma(2) on (0,x)
0.0 0.1 0.2 0.3 0.4 0.5
−0.0
015
−0.0
010
−0.0
005
0.00
00
Lobato 5 Integration error for PDF gamma(1.5) on (0,x)
Gamma(2), f(8) bounded Gamma(1.5), f(8) unbounded
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.17/41
Polynomial Interpolation
Idea:Approximate function g(x) on some interval [x0, xn] bypolynomial p(x) such that
g(xi) = p(xi) for x0 < x1 < · · · < xn and i = 0, . . . , n.
Newton Interpolation uses a numerical stablerepresentation of the polynomial.
Note that we use x0 and xn as borders of the subintervals toavoid discontinuities.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.18/41
Newton Interpolation
For smooth functions the error of the Newton interpolationis:
|g(x)−p(x)| =g(n+1)(ξ)
(n + 1)!
n∏
i=0
(x−xi) for x and ξ in (x0, xn).
Thus the approximation error is O((xn − x0)n+1) and can be
made as small es desired by using shorter subintervals.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.19/41
Interpolation of the Inverse CDF
We can evaluate the CDF F(x) using numerical integration.Thus it is easy to specify the interval and constructionpoints xi and ui = F(xi).Use the pairs (ui, xi) for Newton interpolation of F−1
We have two alternatives for point selection:use the “rescaled” Chebyshev points to select xi.Advantage: SimpleDisadvantage: Interploation may be poor in the tails.use the “rescaled” Chebyshev points for ui.Advantage: optimal interpolationDisadvantage: numerical inversion in set-up
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.20/41
F−1-Normal distribution, equidistant xi
0.980 0.985 0.990 0.995
2.0
2.2
2.4
2.6
2.8
3.0
n=5, N(0,1) on (2,3) xi equidistant
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.21/41
F−1-Normal distribution, Chebyshev xi
0.980 0.985 0.990 0.995
2.0
2.2
2.4
2.6
2.8
3.0
n=5, N(0,1) on (2,3) xi Chebyshev
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.22/41
F−1-Normal distribution, equidistant ui
0.980 0.985 0.990 0.995
2.0
2.2
2.4
2.6
2.8
3.0
n=5, N(0,1) on (2,3) ui Chebyshev
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.23/41
u-error: Normal distribution, equidistant xi
0.978 0.980 0.982 0.984 0.986 0.988
−2.5
e−07
−1.5
e−07
−5.0
e−08
5.0e
−08
n=5, N(0,1) on (2,2.25) xi equidistant
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.24/41
u-error: Normal distribution, Chebyshev xi
0.978 0.980 0.982 0.984 0.986 0.988
−1.5
e−07
−5.0
e−08
5.0e
−08
1.0e
−07
n=5, N(0,1) on (2,2.25) xi Chebyshev
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.25/41
u-error: Normal distribution, equidistant ui
0.978 0.980 0.982 0.984 0.986 0.988
−6e−
08−2
e−08
2e−0
86e
−08
n=5, N(0,1) on (2,2.25) ui Chebyshev
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.26/41
Error for interpolation of inverse CDF
For a Newton polynomial of order n the error bound for thex-error is:
|F−1(u) − F−1a (u)| ≤
maxv((F−1(v))(n+1))
(n + 1)!
n∏
i=0
(u − ui)
Thus the x-error depends on the n-th derivative of 1/f(x)
and is large for the tails.
For short intervals: maximal error attained for u ≈ ui thelocal extrema of∏n
i=0(u − ui).We use:
u-error ≈ maxi
|ui − F(F−1a (ui))|
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.27/41
Polynomial Inversion Algo. // Preprocessing
User input: PDF (need not integrate to 1),point xs, maximal accepted u-error εu.
Assumption: PDF is smooth, support of PDF is an interval.Otherwise: Decomposition in such intervals known.
Steps before constructing the subintervals:Find a < b with f(a) and f(b) < f(xs) · 10
−8.
Calculate approximate I(a,b) =∫b
a f(x)dx numerically.Find cut-off values a, b with tail areas smaller thanI(a,b) εu 0.1.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.28/41
The area below the density
Standard method for numerically integrating long intervals:
Half the intervals, till the total integral does not change anymore.
Gauss-Lobatto well suited as just 6 new points have to beevaluated for both new sub-intervals .
(Gauss-Legendre(4) requires 8 new points.)
It is easy to implement the above idea recursively. Thusonly sub-intervals which still are not integrated exact aresplit.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.29/41
Cut-off Values
The "local concavity" lcf(x) of a density is defined as:
lcf(x) = 1 −f ′′(x) f(x)
f ′(x)2
Assuming that lcf(x) is constant in the far tails of thedistribution we get the approximate tail integral:
T(x) ≈f(x)2
f ′(x)(1 + lcf(x))
The first and the second derivative are estimated usingcentered differences. Then T(x) = εu 0.1 can be solved by aroot-finding algorithm.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.30/41
Polynomial Inversion Algorithm // Set-up
[Function Sub-Interval (x0, xn)]
Select the xi for i = 1, 2, . . . , n.Using n Lobatto(5) integrations calculate the ui = F(xi)
for i = 1 to n.Calculate the coefficents of the Newton Interpolation forthe pairs (ui, xi), i = 0, 1, . . . , n.Calculate the control-points for the u-error.If the u-error at all control-points is a bit smaller than εu
return xn.otherwise retry with smaller/larger value for xn.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.31/41
Polynomial Inversion Algorithm // Set-up Cont.
Use a, b and εu = I · εu from the preprocessing.
[Loop]Set x0 ← a, step← (b − a)/128, j← 0.Repeat while x < b.
x← Sub-Interval(xj, xj + step)
Set step← x − xj, j← j + 1, xj ← x.Store Coefficients of the Newton Polynomials for allsubintervals.Store cumulative probabilities of the subintervals.Make guide table.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.32/41
Polynomial Inversion Algorithm // Sampling
[Generator]
Generate U ∼ U(0, 1).Use indexed search to calculate index J of thecorresponding subinterval.Evaluate the Newton polynomial of interval J at U tocalculate X.Return X.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.33/41
Checking the Approximation
For Inversion Algorithm statistical test not necessary.
Need a (possibly slow) implementation of the CDF(not using any parts of our program!)
Check the u-error for many values of u in (0, 1).
Especially values close to 0 and 1 are important to check.
Also important to control that the algorithm returns a and b
for u = 0 and u = 1.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.34/41
Test Results
We checked our implementation for:Normal, Gamma, and t distributionsMany different parametersFor n = 3 to 11.For εu = 10−6, 10−8, 10−10, 10−12.For 3 · 106 different u-values, many of them in the tails.
Results:For εu not smaller than 10−10 the maximal observed u-errorwas always below εu.
For 10−12 some moderately larger u-errors were observed.Probably error too close to machine precision.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.35/41
u-error for normal distribution
0.0 0.2 0.4 0.6 0.8 1.0
0e+0
02e
−11
4e−1
16e
−11
8e−1
1
ue=1.e−10, n=5
uval
uerro
r
εu = 10−10, n = 5.Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.36/41
Number of Intervals: Linear Interpolation
distribution εu = 10−6 εu = 10−8 εu = 10−10 εu = 10−12
Linear InterpolationNormal 1063 11533 117875 -Cauchy 1849 17491 185335 -Exponential 1012 10406 101959 -Gamma(5) 1072 11225 109336 -Gamma( 1
2) 1546 15432 154291 -
Beta(2,2) 823 8009 88179 -Beta(0.3,3) 1884 18783 187786 -
1001/2 = 10
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.37/41
Number of Intervals: Newton Interpolation
distribution εu = 10−6 εu = 10−8 εu = 10−10 εu = 10−12
n = 3
Normal 58 171 521 1616Cauchy 103 281 820 2512Exponential 41 121 371 1145Gamma(5) 62 177 530 1636
1001/4 = 3.16
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.38/41
Number of Intervals: Newton Interpolation
distribution εu = 10−6 εu = 10−8 εu = 10−10 εu = 10−12
n = 5
Normal 31 61 123 253Cauchy 56 103 197 388Exponential 20 38 75 156Gamma(5) 33 73 135 265Gamma(1.6) 29 57 118 240Hyperbolic(ζ = 1) 33 65 130 267Hyperbolic(ζ = 0.1) 35 68 136 281
1001/6 = 2.15
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.39/41
New Algorithm // Advantages
Requires only the (unnormed) PDF and no CDF.Accuracy can be easily checked. We can thus use themaximal accepted approximation error εu as parameterof the algorithm.
There is not need to compute F−1(u) for any u.Only moderate number of subintervals requiredVery fast.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.40/41
New Algorithm // Disadvantages
Have to cut off tails of distributions with unboundeddomains.
We only can control the “u-error”, i.e.,εu(u) = |u − F(F−1
a (u))|.The “x-error”, i.e., εx(u) = |F−1(u) − F−1
a (u)| can be quitelarge in the tails.Performance depends on floating point numbers used.For future new standards higher order polynomials maybe necessary.
Note:The floating point standard has not changed for 20 years.
Hormann & Derflinger & Leydold– MCQMC-2008/07/11 – Automatic Numeric Inversion – p.41/41