Mainz, May 2, 2013

Statistics, Data Analysis, and Simulation – SS 2013

08.128.730 Statistik, Datenanalyse und Simulation

Dr. Michael O. Distler <[email protected]>


2. Random Numbers

2.1 Why random numbers:

- Simulation
- Sampling
- Numerical analysis
- Computer programming
- Decision making
- Cryptography
- Aesthetics
- Recreation

2.2 Number representation

- unsigned integer
- signed integer
- floating-point format
  - float (32-bit)
  - double (64-bit)

2.3 Random Number Generators

1927 L.H.C. Tippett published 40,000 random digits “taken at random from census reports”.

1939 M.G. Kendall and B. Babington-Smith produced a table of 100,000 random digits using a mechanical random number generator.

1946 John von Neumann first suggested the “middle-square” method. His idea was to take the square of the previous random number and to extract the middle digits; for example, if we are generating 10-digit numbers:

$r_{j+1} = (r_j^2 \;\mathrm{div}\; 100{,}000) \bmod 10{,}000{,}000{,}000$

$r_0 = 5772156649, \quad r_0^2 = 33317\,\underbrace{7923805949}_{r_1 = 7923805949}\,09201$

Von Neumann’s original “middle-square method” has actually proved to be a comparatively poor source of random numbers. The danger is that the sequence tends to get into a rut, a short cycle of repeating elements. For example, if zero ever appears as a number of the sequence, it will continually perpetuate itself.
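The middle-square step and its failure modes can be sketched in a few lines of Python. This is only an illustration: the function name `middle_square` and the `digits` parameter are assumptions made here, and the 4-digit starting value 6100 is an illustrative choice that is not part of the slides.

```python
def middle_square(r, digits=10):
    """One middle-square step for a `digits`-digit number (digits must be even).

    Squares r (up to 2*digits digits), drops the lowest digits/2 digits,
    and keeps the next `digits` digits.
    """
    half = digits // 2
    return (r * r // 10**half) % 10**digits


# The 10-digit example from the text
r0 = 5772156649
r1 = middle_square(r0)
print(r1)  # 7923805949

# Zero perpetuates itself
print(middle_square(0))  # 0

# A short cycle for 4-digit numbers (illustrative starting value)
r, cycle = 6100, []
for _ in range(5):
    cycle.append(r)
    r = middle_square(r, digits=4)
print(cycle)  # [6100, 2100, 4100, 8100, 6100]
```

Python integers have arbitrary precision, so the 20-digit square is computed exactly; in a fixed-width language the square would need a 128-bit or multi-word type.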

Random Number Generators

There is another fairly obvious objection to the “middle-square” technique: how can a sequence generated in such a way be random, since each number is completely determined by its predecessor? The answer is that this sequence isn’t random, but it appears to be.

Sequences generated in a deterministic way such as this are usually called pseudo-random or quasi-random sequences in the technical literature, but here we will make a distinction:

- pseudo-random numbers should pass the same statistical tests as true random numbers do
- quasi-random numbers or sequences can be transformed like random numbers but should only be used for MC integration.

$X_{j+1} = f(X_j, X_{j-1}, \ldots, X_1)$

2.3.1 The Linear Congruential Method

By far the most popular random number generators in use today are special cases of the following scheme, introduced by D.H. Lehmer in 1949. We choose four “magic numbers”:

- $m$, the modulus; $0 < m$.
- $a$, the multiplier; $0 \le a < m$.
- $c$, the increment; $0 \le c < m$.
- $X_0$, the starting value; $0 \le X_0 < m$.

The desired sequence of random numbers is then obtained by setting

$X_{n+1} = (a X_n + c) \bmod m, \quad n \ge 0$

For example, the sequence obtained when $m = 10$ and $X_0 = a = c = 7$ is

7,6,9,0,7,6,9,0, . . .
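The recurrence can be written as a small Python generator. This is a minimal sketch; the function name `lcg` and its keyword arguments are chosen here for illustration only.

```python
from itertools import islice


def lcg(m, a, c, x0):
    """Yield X_0, X_1, X_2, ... with X_{n+1} = (a * X_n + c) mod m."""
    x = x0
    while True:
        yield x
        x = (a * x + c) % m


# The example from the text: m = 10, X_0 = a = c = 7
print(list(islice(lcg(m=10, a=7, c=7, x0=7), 8)))  # [7, 6, 9, 0, 7, 6, 9, 0]
```

The period here is only 4, although the modulus would allow up to 10 distinct values.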

The Linear Congruential Generator (LCG)

The following theorem makes it easy to tell if the maximum period is achieved. The proof can be found in: Donald E. Knuth: The Art of Computer Programming, Vol. 2

Theorem: The linear congruential sequence defined by $m$, $a$, $c$, and $X_0$ has period length $m$ if and only if

i) $c$ is relatively prime to $m$ (German: “teilerfremd”);
ii) $b = a - 1$ is a multiple of $p$, for every prime $p$ dividing $m$;
iii) $b$ is a multiple of 4, if $m$ is a multiple of 4.

Try $m = 16$, $a = 5$, $c = 3$, $X_0 = 0$:

$X_{n+1} = (5 X_n + 3) \bmod 16$
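The three conditions of the theorem can be checked mechanically and compared against a brute-force period count. The sketch below assumes small moduli (it uses trial division and stores every visited state); the helper names `prime_factors`, `has_full_period`, and `period` are illustrative.

```python
from math import gcd


def prime_factors(n):
    """Return the set of prime divisors of n (trial division; fine for small n)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(m, a, c):
    """Check conditions i) to iii) of the theorem."""
    b = a - 1
    return (gcd(c, m) == 1                                   # i)
            and all(b % p == 0 for p in prime_factors(m))    # ii)
            and (m % 4 != 0 or b % 4 == 0))                  # iii)


def period(m, a, c, x0):
    """Length of the cycle the sequence eventually enters (brute force)."""
    seen = {}
    x, n = x0, 0
    while x not in seen:
        seen[x] = n
        x = (a * x + c) % m
        n += 1
    return n - seen[x]


print(has_full_period(16, 5, 3), period(16, 5, 3, 0))   # True 16
print(has_full_period(10, 7, 7), period(10, 7, 7, 7))   # False 4
```

For $m = 16$, $a = 5$, $c = 3$ all three conditions hold and every value from 0 to 15 appears once per cycle; the earlier example with $m = 10$ fails condition ii) because $b = 6$ is not a multiple of 5.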

LCG parameters in common use

| Source | m | a | c | output bits |
|---|---|---|---|---|
| Numerical Recipes | 2^32 | 1664525 | 1013904223 | |
| Borland C/C++ | 2^32 | 22695477 | 1 | bits 30..16 in rand(), bits 30..0 in lrand() |
| glibc (used by GCC) | 2^31 | 1103515245 | 12345 | bits 30..0 |
| ANSI C: Watcom, ... | 2^31 | 1103515245 | 12345 | bits 30..16 |
| Borland Delphi, Virtual Pascal | 2^32 | 134775813 | 1 | bits 63..32 of (seed * L) |
| Microsoft Visual/Quick C/C++ | 2^32 | 214013 (hex 343FD) | 2531011 (hex 269EC3) | bits 30..16 |
| Microsoft Visual Basic (≤ v6) | 2^24 | 1140671485 (hex 43FD43FD) | 12820163 (hex C39EC3) | |
| MMIX by Donald Knuth | 2^64 | 6364136223846793005 | 1442695040888963407 | |
| VAX’s MTH$RANDOM, old versions of glibc | 2^32 | 69069 | 1 | |
| Java’s java.util.Random | 2^48 | 25214903917 | 11 | bits 47..16 |
| LC53 in Forth | 2^32 − 5 | 2^32 − 333333333 | 0 | |

Source: Wikipedia

2.3.2 Multiplicative congruential method

If $c = 0$, the generator is often called a multiplicative congruential generator, or Lehmer RNG.

$X_{n+1} = (a X_n) \bmod m$

Advantage: faster algorithm
Disadvantage: no zero, possibly shorter period

Definition: When $a$ is relatively prime to $m$, the smallest integer $\lambda$ for which $a^{\lambda} \bmod m = 1$ is conventionally called the order of $a$ modulo $m$. Any such value of $a$ that has the maximum possible order modulo $m$ is called a primitive element modulo $m$.

Multiplicative congruential generators

Let $\lambda(m)$ denote the order of a primitive element, i.e., the maximum possible order, modulo $m$.

$\lambda(2) = 1$
$\lambda(4) = 2$
$\lambda(2^e) = 2^{e-2}, \quad e > 2$
$\lambda(p^e) = p^{e-1}(p - 1), \quad \text{prime } p > 2$
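For small moduli, the order of $a$ modulo $m$ and the maximum possible order can be computed by brute force and compared with the closed-form values. This is a sketch for checking the formulas, not an efficient algorithm; the names `order` and `max_order` are chosen here for illustration.

```python
from math import gcd


def order(a, m):
    """Smallest lam >= 1 with a**lam mod m == 1; requires gcd(a, m) == 1."""
    if gcd(a, m) != 1:
        raise ValueError("a must be relatively prime to m")
    lam, x = 1, a % m
    while x != 1 % m:
        x = (x * a) % m
        lam += 1
    return lam


def max_order(m):
    """Largest order of any a relatively prime to m (brute force over all a)."""
    return max(order(a, m) for a in range(1, m + 1) if gcd(a, m) == 1)


for m in (2, 4, 8, 16, 32, 7, 9, 25):
    print(m, max_order(m))
# 2 1, 4 2, 8 2, 16 4, 32 8   (powers of two: 2^(e-2) for e > 2)
# 7 6, 9 6, 25 20             (odd prime powers: p^(e-1) * (p - 1))
```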


Theorem: The number $a$ is a primitive element modulo $p^e$ if and only if

1. $p^e = 2$, $a$ is odd; or $p^e = 4$, $a \bmod 4 = 3$; or $p^e = 8$, $a \bmod 8 = 3, 5, 7$; or $p = 2$, $e \ge 4$, $a \bmod 8 = 3, 5$;
2. $p$ is odd, $e = 1$, $a \not\equiv 0 \pmod{p}$, and $a^{(p-1)/q} \not\equiv 1 \pmod{p}$ for any prime divisor $q$ of $p - 1$;
3. $p$ is odd, $e > 1$, $a$ satisfies (2), and $a^{p-1} \not\equiv 1 \pmod{p^2}$.


Theorem: The maximum period possible when $c = 0$ is $\lambda(m)$. This period is achieved if

1. $X_0$ is relatively prime to $m$;
2. $a$ is a primitive element modulo $m$.

Note that we can obtain a period of length $m - 1$ if $m$ is prime; this is just one less than the maximum length, so for all practical purposes such a period is as long as we want.
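For a prime modulus, the primitive-element test (odd prime $p$, $e = 1$) reduces to a few modular exponentiations, and the resulting period can be confirmed by iteration for small $m$. The sketch below uses illustrative helper names and the small prime $m = 31$; the final check uses $a = 16807$, $m = 2^{31} - 1$, a widely used parameter pair that is not taken from the slides.

```python
def prime_factors(n):
    """Return the set of prime divisors of n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_mod_prime(a, p):
    """Primitive-element test for an odd prime p: a^((p-1)/q) != 1 for all primes q | p-1."""
    if a % p == 0:
        return False
    return all(pow(a, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def mlcg_period(m, a, x0):
    """Period of X_{n+1} = a * X_n mod m, assuming gcd(a, m) = gcd(x0, m) = 1."""
    x, n = (a * x0) % m, 1
    while x != x0:
        x = (a * x) % m
        n += 1
    return n


print(is_primitive_mod_prime(3, 31), mlcg_period(31, 3, 1))  # True 30
print(is_primitive_mod_prime(2, 31), mlcg_period(31, 2, 1))  # False 5
print(is_primitive_mod_prime(16807, 2**31 - 1))              # True
```

With $a = 3$ the generator visits all 30 nonzero residues modulo 31, while $a = 2$ cycles after only 5 steps because $2^5 = 32 \equiv 1 \pmod{31}$.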

2.3.3 Combination of multiple MLCGs

Techniques for producing numbers from various distributions

The inverse transform sampling method:

Let $X$ be a random variable whose distribution can be described by the cumulative distribution function $F$. We want to generate values of $X$ which are distributed according to this distribution.

The inverse transform sampling method works as follows:

1. Generate a random number $u$ from the standard uniform distribution in the interval [0,1].
2. Compute the value $x$ such that $F(x) = u$.
3. Take $x$ to be the random number drawn from the distribution described by $F$.

Inverse transform sampling

Generator gives:

$0 \le X_n < m \;\hookrightarrow\; 0 \le \frac{X_n}{m'} < 1$

Uniform distribution: $U(0,1)$

Transformation:

$f(x)\,dx = U(0,1)\,du$

CDF:

$\int_{-\infty}^{x} f(t)\,dt = F(x) = u$

$x = F^{-1}(u)$
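As a worked example, assume an exponential distribution with rate $\lambda$ (an illustrative choice, not taken from the slides). Its CDF is $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$, so solving $F(x) = u$ gives $x = -\ln(1 - u)/\lambda$. A minimal Python sketch, with hypothetical function names and Python's built-in `random` module standing in for the uniform generator:

```python
import math
import random


def exponential_from_uniform(u, rate):
    """Solve F(x) = u for F(x) = 1 - exp(-rate * x), i.e. x = F^{-1}(u)."""
    return -math.log(1.0 - u) / rate


def sample_exponential(rate, rng=random):
    """Draw one exponential sample by inverse transform sampling."""
    u = rng.random()          # step 1: u uniform on [0, 1)
    return exponential_from_uniform(u, rate)  # steps 2 and 3


rng = random.Random(1)        # fixed seed so the run is reproducible
samples = [sample_exponential(2.0, rng) for _ in range(50_000)]
print(sum(samples) / len(samples))  # should be close to 1 / rate = 0.5
```

Because `random()` returns values in [0, 1), the argument of the logarithm lies in (0, 1] and the logarithm is always defined.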

Acceptance-Rejection Method

Suppose we want to generate samples from a density $f$ defined on some set $\mathcal{X}$.

Let $g$ be a density on $\mathcal{X}$ from which we know how to generate samples and with the property that

$f(x) \le c\,g(x)$

for all $x \in \mathcal{X}$ and some constant $c$.

1. Generate $X$ from distribution $g$.
2. Generate $U$ from $U(0,1)$.
3. If $U \le \frac{f(X)}{c\,g(X)}$, return $X$; otherwise go to Step 1.

The acceptance-rejection method for sampling from density $f$ uses candidates from density $g$.
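The three steps can be sketched in Python for a concrete case. The target density $f(x) = 6x(1 - x)$ on [0, 1], the uniform proposal $g(x) = 1$, and the constant $c = 1.5$ (the maximum of $f$) are all illustrative assumptions, as are the names `f`, `C`, and `sample_f`.

```python
import random


def f(x):
    """Target density (illustrative): f(x) = 6 x (1 - x) on [0, 1]."""
    return 6.0 * x * (1.0 - x)


def g(x):
    """Proposal density: uniform on [0, 1]."""
    return 1.0


C = 1.5  # f attains its maximum 1.5 at x = 0.5, so f(x) <= C * g(x)


def sample_f(rng=random):
    """Draw one sample from f by acceptance-rejection."""
    while True:
        x = rng.random()               # step 1: candidate X from g
        u = rng.random()               # step 2: U from U(0, 1)
        if u <= f(x) / (C * g(x)):     # step 3: accept, else try again
            return x


rng = random.Random(0)                 # fixed seed so the run is reproducible
samples = [sample_f(rng) for _ in range(20_000)]
print(sum(samples) / len(samples))     # should be close to the mean of f, 0.5
```

On average a fraction $1/c$ of the candidates is accepted (here two out of three), so a proposal $g$ that follows $f$ closely keeps the number of wasted draws small.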