Simulation (AMSI Public Lecture)

Simulation: a ubiquitous tool for statistical computation. Christian P. Robert, Université Paris-Dauphine, IUF, & CREST, http://www.ceremade.dauphine.fr/~xian, August 21, 2012

DESCRIPTION

These are the slides of my AMSI 2012 Public Lecture.

TRANSCRIPT

Page 1: Simulation (AMSI Public Lecture)

Simulation: a ubiquitous tool for statistical computation

Christian P. Robert

Université Paris-Dauphine, IUF, & CREST

http://www.ceremade.dauphine.fr/~xian

August 21, 2012

Page 2: Simulation (AMSI Public Lecture)

Outline

Simulation, what’s that for?!

Producing randomness by deterministic means

Monte Carlo principles

Simulated annealing

Page 3: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Evaluation of the behaviour of a complex system (network, computer program, queue, particle system, atmosphere, epidemics, economic actions, etc.)

[© Office of Oceanic and Atmospheric Research]

Page 4: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Production of changing landscapes, characters, and behaviours in computer games and flight simulators

[© guides.ign.com]

Page 5: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Determination of the probabilistic properties of a new statistical procedure, or of a procedure under an unknown distribution [bootstrap]

(left) Estimation of the cdf F from a normal sample of 100 points;

(right) variation of this estimation over 200 normal samples

Page 6: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Validation of a probabilistic model

Histogram of 10^3 variates from a distribution, with the fit given by the density of this distribution

Page 7: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Approximation of an integral

[© my daughter’s math book]

Page 8: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Maximisation of a weakly regular function/likelihood

[© Dan Rice Sudoku blog]

Page 9: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Pricing of a complex financial product (exotic options)

Simulation of a GARCH(1,1) process and of its volatility (10^3 time units)

Page 10: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Handling complex statistical problems by approximate Bayesian computation (ABC)

core principle

- Simulate a parameter value (at random) and pseudo-data from the likelihood until the pseudo-data is “close enough” to the observed data, then

- keep the corresponding parameter value

[Griffith & al., 1997; Tavaré & al., 1999]
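The core principle above can be sketched as a rejection loop. This is only an illustrative toy, not the setting of the cited papers: the uniform prior, the binomial pseudo-data, the tolerance `eps`, and the function name `abc_rejection` are all assumptions made for the example.

```python
import random


def abc_rejection(y_obs, n_trials, n_keep, eps, rng):
    """Toy ABC rejection sampler (hypothetical example).

    Assumed model: theta ~ U(0, 1) a priori, data = number of successes
    in n_trials Bernoulli(theta) draws.
    """
    kept = []
    while len(kept) < n_keep:
        theta = rng.random()  # simulate a parameter value at random
        # simulate pseudo-data from the likelihood given theta
        y_sim = sum(rng.random() < theta for _ in range(n_trials))
        # keep theta only if the pseudo-data is "close enough" to the data
        if abs(y_sim - y_obs) <= eps:
            kept.append(theta)
    return kept


if __name__ == "__main__":
    sample = abc_rejection(y_obs=6, n_trials=20, n_keep=500, eps=0,
                           rng=random.Random(0))
    print("ABC posterior mean:", sum(sample) / len(sample))
```

With `eps=0` in this toy model the kept values follow the exact posterior, so the printed mean can be compared with (6 + 1) / (20 + 2); a larger `eps` trades accuracy for a higher acceptance rate.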

Page 11: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Handling complex statistical problems by approximate Bayesian computation (ABC)

demo-genetic inference

Genetic model of evolution from a common ancestor (MRCA) characterized by a set of parameters that cover historical, demographic, and genetic factors

Dataset of polymorphism (DNA sample) observed at the present time

[Embedded slide] Several possible scenarios, with the choice of scenario made by ABC. Scenario 1a is strongly supported relative to the others, which argues for a common origin of the Pygmy populations of West Africa (Verdu et al., 2009)

Page 12: Simulation (AMSI Public Lecture)

Simulation, what’s that for?!

Illustrations

Necessity to “(re)produce chance” on a computer

- Handling complex statistical problems by approximate Bayesian computation (ABC)

Pygmy population demo-genetics

Pygmy populations: do they have a common origin? When and how did they split from non-Pygmy populations? Were there more recent interactions between Pygmy and non-Pygmy populations?

Page 13: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Pseudo-random generator

Pivotal element/building block of simulation: always requires availability of uniform U(0, 1) random variables

[© MMP World]

Page 14: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Pseudo-random generator

0.1333139 0.3026299 0.4342966 0.2395357 0.3223723 0.8531162 0.3921457 0.7625259 0.1701947 0.2816627 ...

Page 15: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Pseudo-random generator

Definition (Pseudo-random generator)

A pseudo-random generator is a deterministic function f from ]0, 1[ to ]0, 1[ such that, for any starting value $u_0$ and any n, the sequence

$\{u_0, f(u_0), f(f(u_0)), \ldots, f^n(u_0)\}$

behaves (statistically) like an iid U(0, 1) sequence

Page 16: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Pseudo-random generator

[Figure: 10 steps $(u_t, u_{t+1})$ of a uniform generator]

Page 17: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Philosophical foray

Paradox!

While avoiding randomness, the deterministic sequence

$(u_0, u_1 = f(u_0), \ldots, u_n = f(u_{n-1}))$

must resemble a random sequence!

Debate on whether or not true randomness does exist (Laplace’s demon versus Schrödinger’s cat), in which case pseudo-random generators are not random (von Neumann’s state of sin)

Page 21: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

True random generators

Intel circuit producing “truly random” numbers:

There is no reason physical generators should be “more” random than congruential (deterministic) pseudo-random generators, as those are valid generators, i.e. their distribution is exactly known (e.g., uniform) and, in the case of parallel generations, completely independent

Page 22: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

True random generators

Intel generator satisfies all benchmarks of “randomness” maintained by NIST:

Skepticism about physical devices, when compared with mathematical functions, because of (a) non-reproducibility and (b) instability of the device, which means that proven uniformity at time t does not imply uniformity at time t + 1

Page 23: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

A standard uniform generator

The congruential generator on {1, 2, . . . , M}

$f(x) = (ax + b) \bmod M$

has a period equal to M for proper choices of (a, b) and becomes a generator on ]0, 1[ when dividing by M + 1

Page 24: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

A standard uniform generator

Example

Take

$f(x) = (69069069x + 12345) \bmod 2^{32}$

and produce

... 518974515 2498053016 1113825472 1109377984 ...

i.e.

... 0.1208332 0.5816233 0.2593327 0.2582972 ...
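A minimal sketch of such a congruential generator, using the constants of the example as defaults. The function names, the seed, and the choice of mapping an integer x to (x + 1) / (M + 1) so that the result stays strictly inside ]0, 1[ are assumptions made for this illustration, not something stated on the slide.

```python
def lcg_integers(seed, n, a=69069069, b=12345, m=2 ** 32):
    """Return n successive values of x -> (a * x + b) mod m, starting from seed."""
    values = []
    x = seed
    for _ in range(n):
        x = (a * x + b) % m
        values.append(x)
    return values


def lcg_uniforms(seed, n, a=69069069, b=12345, m=2 ** 32):
    """Map the integer sequence into the open interval ]0, 1[.

    Assumption: values lie in {0, ..., m - 1}, so (x + 1) / (m + 1)
    can never be exactly 0 or 1.
    """
    return [(x + 1) / (m + 1) for x in lcg_integers(seed, n, a, b, m)]


if __name__ == "__main__":
    print(lcg_uniforms(seed=2012, n=5))
    # A tiny generator with full period: every residue mod 16 is visited once.
    print(sorted(lcg_integers(seed=0, n=16, a=5, b=3, m=16)))
```

The sequence is entirely determined by the seed, which is what makes simulation experiments reproducible.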

Page 27: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Approximating π

My daughter’s pseudo-code:

N=1000
π = 0
for I=1,N do
    X=RDN(1), Y=RDN(1)
    if X^2 + Y^2 < 1 then
        π = π + 1
    end if
end for
return 4*π/N
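The pseudo-code above translates almost line by line into Python. In this sketch `RDN(1)` is assumed to mean one uniform draw on (0, 1), and the function name `estimate_pi` and the seed are illustrative choices.

```python
import random


def estimate_pi(n, rng):
    """Estimate pi from n uniform points thrown on the unit square."""
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        # count the points falling inside the quarter disc of radius 1
        if x * x + y * y < 1.0:
            inside += 1
    # the quarter disc has area pi / 4, hence the factor 4
    return 4.0 * inside / n


if __name__ == "__main__":
    rng = random.Random(2012)
    for n in (100, 1000, 10_000, 1_000_000):
        print(n, estimate_pi(n, rng))
```

The error of the estimate shrinks roughly like one over the square root of the number of points, which is why the gain from each tenfold increase in simulations is modest.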

Page 28: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Approximating π

[Figure: 100 simulations, pi = 3.2]

Page 29: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Approximating π

[Figure: 1000 simulations, pi = 3.108]

Page 30: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Approximating π

[Figure: 10,000 simulations, pi = 3.136]

Page 31: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Approximating π

[Figure: 10^6 simulations]

Page 32: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Distributions that differ from uniform distributions

Problem

Given a probability distribution with density f, how can we produce randomness according to f?!

- algorithms implemented in resident software are only available for common distributions

- new distributions may require fast resolution

- no approximation allowed

[Figure: an arbitrary density, f(x) plotted against x]

Page 34: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Accept-Reject Algorithm (remember π?!)

Given a probability distribution with density f, how can we produce randomness according to f?!

The uniform distribution on the sub-graph region

$S_f = \{(x, u);\ 0 \le u \le f(x)\}$

produces a marginal distribution in x with density f. (“Fundamental theorem of simulation”)

[Figure: f(x) plotted against x]
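A short worked derivation of why this holds, assuming that f is a normalised density, so that the region $S_f$ has area 1 and the uniform distribution on it has joint density equal to 1:

```latex
% (X, U) uniform on S_f, with \int f(x)\,dx = 1
p(x, u) = \mathbb{1}\{0 \le u \le f(x)\}
\quad\Longrightarrow\quad
p_X(x) = \int_0^{f(x)} 1 \,\mathrm{d}u = f(x)
```

Integrating out the auxiliary variable u therefore returns exactly f as the marginal density of x.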

Page 35: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Accept-Reject Algorithm (remember π?!)

Given a probability distribution with density f, how can we produce randomness according to f?!

In practice, this means we are back to throwing dots on a box containing

$S_f = \{(x, u);\ 0 \le u \le f(x)\}$

and counting those inside!

[Figure: dots thrown on a box containing the subgraph of f(x)]

Page 36: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Accept-Reject Algorithm

Refinement of the above through construction of a proper box:

1. Find a hat on f, i.e. a density g that can be simulated and such that

$\sup_x f(x)/g(x) = M < \infty$

2. Generate dots on the subgraph of g, i.e. $Y_1, Y_2, \ldots \sim g$, and $U_1, U_2, \ldots \sim U([0, 1])$

3. Accept only the $Y_k$’s such that

$U_k \le f(Y_k) / \{M g(Y_k)\}$
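The three steps can be sketched as follows. The target f(x) = 6x(1 − x) on [0, 1], the uniform hat g, the bound M = 1.5, and the function names are assumptions chosen to keep the example small; they are not taken from the slides.

```python
import random


def target_density(x):
    """Illustrative target: f(x) = 6 x (1 - x) on [0, 1] (a Beta(2, 2) density)."""
    return 6.0 * x * (1.0 - x)


def accept_reject(n, rng, bound=1.5):
    """Accept-Reject with proposal g = U(0, 1), so that g(y) = 1 and sup f/g = 1.5.

    Returns the accepted values and the total number of proposals used.
    """
    accepted, proposals = [], 0
    while len(accepted) < n:
        y = rng.random()  # Y_k ~ g
        u = rng.random()  # U_k ~ U([0, 1])
        proposals += 1
        # accept Y_k when U_k <= f(Y_k) / (M g(Y_k)), with g(Y_k) = 1 here
        if u <= target_density(y) / bound:
            accepted.append(y)
    return accepted, proposals


if __name__ == "__main__":
    sample, proposals = accept_reject(20_000, random.Random(0))
    print("sample mean:", sum(sample) / len(sample))
    print("proposals per accepted value:", proposals / len(sample))
```

In this toy setting the ratio of proposals to accepted values should settle near M = 1.5, and the sample mean near 0.5.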

Page 39: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Accept-Reject Algorithm

Refinement of the above through construction of a proper box:

[Figure: simulated dots over the plot of f(x) against x]

Page 40: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Properties

- Does not require the normalisation constant of f

- Does not require an exact upper bound M

- Requires on average M $Y_k$’s for one simulated X (efficiency measure)

Page 44: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Slice sampler

There exist other ways of exploiting the fundamental idea of simulation over the subgraph:

If direct uniform simulation on

$S_f = \{(u, x);\ 0 \le u \le f(x)\}$

is too complex [because of an unavailable hat], use instead a random walk on $S_f$:

Page 45: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Slice sampler

There exist other ways of exploiting the fundamental idea of simulation over the subgraph:

this means doing random jumps in the vertical then the horizontal direction, accounting for the boundaries

- $0 \le u \le f(x)$, i.e. $u \sim U(0, f(x))$

- $f(x) \ge u$, i.e. $x \sim U_{S(u)}$

Page 46: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Slice sampler

Slice sampler algorithm

For t = 1, . . . , T, when at $(x^{(t)}, \omega^{(t)})$, simulate

1. $\omega^{(t+1)} \sim U_{[0, f(x^{(t)})]}$

2. $x^{(t+1)} \sim U_{S^{(t+1)}}$, where

$S^{(t+1)} = \{y;\ f(y) \ge \omega^{(t+1)}\}$.
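A minimal sketch of the two steps, for a target whose slices are available in closed form. The unnormalised normal density f(x) = exp(−x²/2) is an assumption made for illustration: its slice {y; f(y) ≥ ω} is the interval [−√(−2 log ω), √(−2 log ω)]. The function names are hypothetical.

```python
import math
import random


def f(x):
    """Illustrative target, known up to a constant: f(x) = exp(-x^2 / 2)."""
    return math.exp(-0.5 * x * x)


def slice_sampler(n, rng, x0=0.0):
    """Run n iterations of the slice sampler on f, starting from x0."""
    x, chain = x0, []
    for _ in range(n):
        # step 1: vertical move, omega uniform on (0, f(x)]
        omega = f(x) * (1.0 - rng.random())
        # step 2: horizontal move, x uniform on the slice {y; f(y) >= omega}
        half_width = math.sqrt(-2.0 * math.log(omega))
        x = rng.uniform(-half_width, half_width)
        chain.append(x)
    return chain


if __name__ == "__main__":
    chain = slice_sampler(20_000, random.Random(0))
    mean = sum(chain) / len(chain)
    var = sum((v - mean) ** 2 for v in chain) / len(chain)
    print("mean:", mean, "variance:", var)
```

For most densities the slice is not an explicit interval, which is one reason the slice sampler cannot always be implemented this directly.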

Page 47: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Slice sampler

[Figure: successive steps of the slice sampler; x axis from 0.0 to 1.0, y axis from 0.000 to 0.010]

Page 55: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Metropolis–Hastings algorithm

Further generalisation of the fundamental idea to situations when the slice sampler cannot be easily implemented

Idea

Create a sequence $(X_n)$ such that, for n ‘large enough’, the density of $X_n$ is close to f, only using a ‘local’ knowledge of f...

This is the domain of Markovian simulation methods: there is a Markov dependence between the $X_n$’s

[Picture: Andreï Markov]

Page 56: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Metropolis–Hastings algorithm

[Picture: Markov bar, Melbourne]

Page 57: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Metropolis–Hastings algorithm

‘local’ exploration can produce ‘global’ vision:

[© Civilization 5]

Page 58: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

The Metropolis–Hastings algorithm (2)

If f is the density of interest, we pick a proposal conditional density

$q(y \mid x_n)$

such that

- it connects with the current value $x_n$

- it is easy to simulate

- it is positive everywhere f is positive

Page 61: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Random walk Metropolis–Hastings

Proposal

$Y_n = X_n + \varepsilon_n$,

where $\varepsilon_n \sim g$, independent from $X_n$, and g symmetrical

Motivation

local perturbation of $X_n$ / myopic exploration of its neighbourhood

$Y_n$ accepted or rejected depending on the relative values of $f(X_n)$ and $f(Y_n)$

Page 63: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Random walk Metropolis–Hastings

Resulting algorithm

Random walk Metropolis–Hastings

Starting from $X^{(t)} = x^{(t)}$

1. Generate $Y_t \sim g(y - x^{(t)})$

2. Take

$$X^{(t+1)} = \begin{cases} Y_t & \text{with probability } \min\{1, f(Y_t)/f(x^{(t)})\}, \\ x^{(t)} & \text{otherwise} \end{cases}$$
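A minimal sketch of the resulting algorithm. The target (an unnormalised standard normal, handled on the log scale), the Gaussian choice for the symmetric g, the `scale` parameter, and the function names are assumptions made for this illustration.

```python
import math
import random


def log_f(x):
    """Log of an illustrative unnormalised target, f(x) = exp(-x^2 / 2)."""
    return -0.5 * x * x


def rw_metropolis(n, scale, rng, x0=0.0):
    """Random walk Metropolis-Hastings with a N(0, scale^2) perturbation g.

    Returns the chain and the observed acceptance rate.
    """
    x, chain, accepted = x0, [], 0
    for _ in range(n):
        y = x + rng.gauss(0.0, scale)  # Y_t ~ g(y - x^(t))
        # accept with probability min{1, f(Y_t) / f(x^(t))}, on the log scale
        if math.log(1.0 - rng.random()) <= log_f(y) - log_f(x):
            x = y
            accepted += 1
        chain.append(x)  # when rejected, the current value is repeated
    return chain, accepted / n


if __name__ == "__main__":
    for scale in (0.1, 2.4, 20.0):
        chain, rate = rw_metropolis(20_000, scale, random.Random(0))
        print("scale", scale, "acceptance rate", round(rate, 3))
```

Only the ratio f(Y_t)/f(x^(t)) is used, so the normalising constant of f is never needed. Running the loop with several scales shows how the acceptance rate depends on the dispersion of g.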

Page 64: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Properties

- Always accepts higher points and sometimes lower points (similar to gradient algorithms)

- Depends on the dispersion of g

- Average probability of acceptance

$\varrho = \iint \min\{f(x), f(y)\}\, g(y - x)\, dx\, dy$

  - close to 1 if g has a small variance [Danger!]
  - far from 1 if g has a large variance [Re-Danger!]

Page 67: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Properties

[Figure: scale=1]

Page 68: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Properties

[Figure: scale=2]

Page 69: Simulation (AMSI Public Lecture)

Producing randomness by deterministic means

Properties

[Figure: scale=3]

Page 70: Simulation (AMSI Public Lecture)

Monte Carlo principles

General purpose of Monte Carlo methods

Given a probability density f known up to a normalizing constant, $f(x) \propto \tilde f(x)$, and an integrable function h, compute

$$I(h) = \int h(x) f(x)\,dx = \frac{\int h(x) \tilde f(x)\,dx}{\int \tilde f(x)\,dx}$$

when $\int h(x) \tilde f(x)\,dx$ is intractable.

[Remember π!!]

Page 72: Simulation (AMSI Public Lecture)

Monte Carlo principles

Monte Carlo basics

Generate a pseudo-random sample $x_1, \ldots, x_N$ from f and estimate I(h) by

$$\hat I^{MC}_N(h) = N^{-1} \sum_{i=1}^N h(x_i)$$

and let N grow to infinity...
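The estimator above is just an empirical average. In this sketch the target f is taken to be the standard normal and h(x) = x², so that I(h) = 1; both choices, and the function name `monte_carlo`, are assumptions made for illustration.

```python
import random


def monte_carlo(h, draw_from_f, n):
    """Monte Carlo estimate of I(h): the average of h over n draws from f."""
    return sum(h(draw_from_f()) for _ in range(n)) / n


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000, 1_000_000):
        estimate = monte_carlo(lambda x: x * x, lambda: rng.gauss(0.0, 1.0), n)
        print(n, estimate)
```

As N grows, the printed estimates should concentrate around the true value 1.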

Page 73: Simulation (AMSI Public Lecture)

Monte Carlo principles

Monte Carlo basics

[Picture: A. Doucet, C. Andrieu, X, A. Philippe, J. Rosenthal, E. Moulines]

Page 75: Simulation (AMSI Public Lecture)

Monte Carlo principles

Monte Carlo basics

Caveat

Often impossible or inefficient to simulate directly from f

Page 76: Simulation (AMSI Public Lecture)

Monte Carlo principles

Importance Sampling

For a proposal distribution with density $q(x)$, alternative representation

$$I(h) = \int h(x)\,\{f/q\}(x)\,q(x)\,dx$$

Principle

Generate an iid sample $x_1, \ldots, x_N \sim q$ and estimate $I(h)$ by

$$I^{IS}_{q,N}(h) = N^{-1} \sum_{i=1}^{N} h(x_i)\,\{f/q\}(x_i).$$
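A hedged sketch of the importance sampling estimator in R. The target (standard normal), the integrand $h(x) = x^2$ and the Student $t_3$ proposal are illustrative assumptions, not choices made in the lecture. The second estimate divides by the sum of the weights, so it only needs $f$ up to its normalising constant, as in the general problem stated earlier.

```r
set.seed(2)
N <- 1e5
h <- function(x) x^2                       # illustrative integrand (assumption)

x <- rt(N, df = 3)                         # iid sample from the proposal q
w <- dnorm(x) / dt(x, df = 3)              # importance weights {f/q}(x_i)
is_est <- mean(h(x) * w)                   # N^{-1} sum_i h(x_i) {f/q}(x_i)

# self-normalised version: only the unnormalised density exp(-x^2/2) is used
w_tilde <- exp(-x^2 / 2) / dt(x, df = 3)
snis_est <- sum(h(x) * w_tilde) / sum(w_tilde)

c(is_est, snis_est)                        # both should be close to 1
```

The $t_3$ proposal has heavier tails than the normal target, which keeps the weights bounded; a proposal with lighter tails than $f$ can produce an estimator with infinite variance.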

Simulated annealing

Optimisation problems

A (genuine) puzzle

During a dinner with 20 couples sitting at four tables with ten seats, everyone wants to share a table with everyone. The assembly decides to switch seats after each serving towards this goal. What is the minimal number of servings needed to ensure that every couple shared a table with every other couple at some point? And what is the optimal switching strategy?

[http://xianblog.wordpress.com/2012/04/12/not-le-monde-puzzle/]
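A quick counting argument gives a lower bound (not the answer) on the number of servings: a table with ten seats holds five couples, so each couple meets at most four new couples per serving and has nineteen others to meet. In R, with variable names chosen here for illustration:

```r
couples     <- 20
per_table   <- 10 / 2                          # five couples at a ten-seat table
new_max     <- per_table - 1                   # couples met in one serving, at most
lower_bound <- ceiling((couples - 1) / new_max)
lower_bound                                    # 5
```

Whether five servings actually suffice is a separate combinatorial question that this bound does not settle.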


Stochastic minimisation

Example (Egg-box function)

Consider the function

$$h(x, y) = (x \sin(20y) + y \sin(20x))^2 \cosh(\sin(10x)\,x) + (x \cos(10y) - y \sin(10x))^2 \cosh(\cos(20y)\,y),$$

to be minimised. (I know that the global minimum is 0 for $(x, y) = (0, 0)$.)
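The function is easy to code and explore numerically. A small sketch in R; the grid over $[-1, 1]^2$ is an assumption made here for display purposes only.

```r
# egg-box function as written on the slide
h <- function(x, y) {
  (x * sin(20 * y) + y * sin(20 * x))^2 * cosh(sin(10 * x) * x) +
    (x * cos(10 * y) - y * sin(10 * x))^2 * cosh(cos(20 * y) * y)
}

h(0, 0)                                  # 0, the global minimum

# evaluate on a grid (the range [-1, 1]^2 is an assumption)
g <- seq(-1, 1, length.out = 201)
z <- outer(g, g, h)
min(z)                                   # never below 0, since h is a sum of squares times cosh
# image(g, g, z) would display the many local minima
```

Since both terms are squares multiplied by a positive `cosh` factor, $h \ge 0$ everywhere, which is what makes the value 0 a global minimum.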

Example (egg-box function (2))

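Instead of solving the first order equations

$$\frac{\partial h(x, y)}{\partial x} = 0, \qquad \frac{\partial h(x, y)}{\partial y} = 0$$

and of checking that the second order conditions are met, we can generate a random sequence in $\mathbb{R}^2$

$$\theta_{j+1} = \theta_j + \frac{\alpha_j}{2\beta_j}\,\Delta h(\theta_j, \beta_j \zeta_j)\,\zeta_j$$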

where

- the $\zeta_j$'s are uniform on the unit circle $x^2 + y^2 = 1$;
- $\Delta h(\theta, \zeta) = h(\theta + \zeta) - h(\theta - \zeta)$;
- $(\alpha_j)$ and $(\beta_j)$ converge to 0
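The recursion can be sketched in R as follows. Several points are assumptions made here rather than details from the slides: the step is taken with a minus sign because $h$ is to be minimised (the displayed recursion with a plus sign moves uphill), the starting point is arbitrary, the default schedules are $\alpha_j = 1/(j+1)$ and $\beta_j = 1/(j+1)^{0.5}$, and steps longer than 1 are shortened as a safeguard. The function name `stoch_grad` is hypothetical.

```r
set.seed(3)
h <- function(th) {
  x <- th[1]; y <- th[2]
  (x * sin(20 * y) + y * sin(20 * x))^2 * cosh(sin(10 * x) * x) +
    (x * cos(10 * y) - y * sin(10 * x))^2 * cosh(cos(20 * y) * y)
}

stoch_grad <- function(h, start, n_iter = 1000,
                       alpha = function(j) 1 / (j + 1),
                       beta  = function(j) 1 / (j + 1)^0.5) {
  theta <- start
  best <- start; best_val <- h(start)
  for (j in seq_len(n_iter)) {
    u <- runif(1, 0, 2 * pi)
    zeta <- c(cos(u), sin(u))                          # uniform on the unit circle
    dh <- h(theta + beta(j) * zeta) - h(theta - beta(j) * zeta)
    step <- alpha(j) / (2 * beta(j)) * dh * zeta
    len <- sqrt(sum(step^2))
    if (is.finite(len) && len > 1) step <- step / len  # safeguard (assumption)
    if (all(is.finite(step))) theta <- theta - step    # minus sign: descent
    val <- h(theta)
    if (is.finite(val) && val < best_val) { best <- theta; best_val <- val }
  }
  list(theta = theta, best = best, value = best_val)
}

start <- c(0.65, 0.8)                                  # arbitrary starting point
res <- stoch_grad(h, start)
res$value <= h(start)                                  # TRUE by construction
```

The sketch returns the best point visited as well as the last one, since a stochastic sequence of this kind need not decrease $h$ at every step.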

Calibration parameters

choice       1                        2                         3                    4
$\alpha_j$   $1/\log(j+1)$            $1/(100\log(j+1))$        $1/(j+1)$            $1/(j+1)$
$\beta_j$    $1/\log(j+1)^{0.1}$      $1/\log(j+1)^{0.1}$       $1/(j+1)^{0.5}$      $1/(j+1)^{0.1}$

- $\alpha \downarrow 0$ slowly, $\sum_j \alpha_j = \infty$
- $\beta \downarrow 0$ more slowly, $\sum_j (\alpha_j/\beta_j)^2 < \infty$
- Scenarios 1-2: not enough energy
- Scenarios 3-4: good

[© George Casella 1951–2012]
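The four calibration choices can be written as R functions of $j$ to compare how quickly the step multiplier $\alpha_j/(2\beta_j)$ of the recursion shrinks. This is a sketch under one reading of the table: exponents such as `.1` are taken to apply to the whole logarithm or to $(j+1)$, and `1/100 log(j + 1)` is read as $1/(100\log(j+1))$.

```r
# the four (alpha_j, beta_j) choices from the table, as functions of j
scenarios <- list(
  list(alpha = function(j) 1 / log(j + 1),         beta = function(j) 1 / log(j + 1)^0.1),
  list(alpha = function(j) 1 / (100 * log(j + 1)), beta = function(j) 1 / log(j + 1)^0.1),
  list(alpha = function(j) 1 / (j + 1),            beta = function(j) 1 / (j + 1)^0.5),
  list(alpha = function(j) 1 / (j + 1),            beta = function(j) 1 / (j + 1)^0.1)
)

j <- c(1, 10, 100, 1000)
# step multiplier alpha_j / (2 beta_j), one row per choice, one column per j
mult <- t(sapply(scenarios, function(s) s$alpha(j) / (2 * s$beta(j))))
dimnames(mult) <- list(paste("choice", 1:4), paste0("j=", j))
mult
```

The logarithmic schedules decay far more slowly than the polynomial ones, which is the main difference between the first two and the last two columns of the table.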

The traveling salesman problem

A classical allocation problem:

- Salesman who needs to visit $n$ cities in a row

- Traveling costs between pairs of cities are known

- Search of the optimal circuit

Aboriginal art, NGA, Melbourne

Procter & Gamble competition, 1962
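The quantity to minimise is the total cost of a closed circuit. A minimal R sketch, with a toy instance of four cities on the corners of the unit square; the function name `tour_cost` and the Euclidean costs are assumptions made for illustration.

```r
# cost of a closed circuit visiting the cities in the order given by `tour`
tour_cost <- function(tour, D) {
  nxt <- c(tour[-1], tour[1])          # successor of each city, back to the start
  sum(D[cbind(tour, nxt)])             # add up the cost of every leg
}

# toy instance: four cities on the corners of the unit square
xy <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
D  <- as.matrix(dist(xy))              # Euclidean travelling costs

tour_cost(1:4, D)                      # 4: around the perimeter
tour_cost(c(1, 3, 2, 4), D)            # 2 + 2 * sqrt(2): uses both diagonals
```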

An NP-complete problem

The traveling salesman problem is an example of mathematical problems that require explosive resolution times

Number of possible circuits $n!$ and exact solutions available in $O(2^n)$ time

Exact solution for 15,112 German cities found in 2001 in 22.6 CPU years.
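To get a feeling for these growth rates, the two quantities can be tabulated in R for a few small values of $n$ (the values of $n$ are arbitrary choices made here):

```r
n <- c(5, 10, 15, 20)
growth <- data.frame(n = n,
                     circuits  = factorial(n),   # n! orderings of the cities
                     two_pow_n = 2^n)            # growth rate quoted for exact solvers
growth
```

Already at $n = 10$ there are 3,628,800 orderings against $2^{10} = 1024$, and the gap widens very quickly afterwards.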

Numerous practical consequences (networks, integrated circuit design, genomic sequencing, etc.)

Exact solution for the 24,978 Swedish cities found in 2004 in 84.8 CPU years.

Resolution via simulation

The simulated annealing algorithm:

- name is borrowed from metallurgy

- metal manufactured by a slow decrease of temperature (annealing) is stronger than when manufactured by a fast decrease

[© Joachim Robert, ca. 2006]

Repeat

- Random modifications of parts of the original circuit with cost $C_0$

- Evaluation of the cost $C$ of the new circuit

- Acceptance of the new circuit with probability

$$\min\left\{\exp\left\{\frac{C_0 - C}{T}\right\}, 1\right\}$$

$T$, the temperature, is progressively reduced

[Metropolis et al., 1953]
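The three steps above can be sketched in R for a small random instance. Everything beyond the acceptance rule is an assumption made for illustration: the modification is a segment reversal, the temperature follows a geometric schedule, the cities are uniform points in the unit square, and the names `tour_cost` and `anneal_tsp` are hypothetical.

```r
set.seed(4)

tour_cost <- function(tour, D) {
  nxt <- c(tour[-1], tour[1])
  sum(D[cbind(tour, nxt)])
}

anneal_tsp <- function(D, init, n_iter = 5000, T0 = 1, cooling = 0.999) {
  n <- nrow(D)
  cur <- init; C0 <- tour_cost(cur, D)
  best <- cur; best_cost <- C0
  temp <- T0
  for (t in seq_len(n_iter)) {
    # random modification: reverse the segment between two random positions
    ij <- sort(sample(n, 2))
    prop <- cur
    prop[ij[1]:ij[2]] <- rev(cur[ij[1]:ij[2]])
    C <- tour_cost(prop, D)
    # accept with probability min{exp((C0 - C) / T), 1}
    if (log(runif(1)) < (C0 - C) / temp) {
      cur <- prop; C0 <- C
      if (C0 < best_cost) { best <- cur; best_cost <- C0 }
    }
    temp <- temp * cooling             # progressive reduction of T (assumed geometric)
  }
  list(tour = best, cost = best_cost)
}

n    <- 15
xy   <- matrix(runif(2 * n), ncol = 2) # random cities in the unit square
D    <- as.matrix(dist(xy))
init <- sample(n)
res  <- anneal_tsp(D, init)
c(initial = tour_cost(init, D), annealed = res$cost)
```

Comparing on the log scale, `log(runif(1)) < (C0 - C) / temp`, is equivalent to accepting with the probability above and avoids overflow in `exp` when the temperature gets small.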

A family meeting (1)

Recall the table puzzle

Definition of a target function

Record the tables of the 20 couples over the 6 courses as a (20, 6) matrix, e.g.,

$$S = \begin{pmatrix} 1 & 1 & 3 & 2 & 4 & 2 \\ 2 & 1 & 3 & 3 & 2 & 4 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 3 & 2 & 2 & 2 & 4 & 4 \end{pmatrix}$$

and define a penalty function as the number of missed encounters

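# random initial seating: row = couple, column = course, entry = table (1 to 4)
I=sample(rep(1:4,5))
for (i in 2:6)
  I=cbind(I,sample(rep(1:4,5)))

# M[a,b] = number of courses during which couples a and b share a table
meet=function(I){
  M=outer(I[,1],I[,1],"==")
  for (i in 2:6)
    M=M+outer(I[,i],I[,i],"==")
  M
}

# number of (ordered) pairs of couples that never met
penalty=function(M){ sum(M==0) }
penat=penalty(meet(I))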
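A toy case small enough to check by hand shows what the penalty counts. The sketch below uses a variant of `meet()` that loops over however many courses the matrix has (a change made here so that a two-course example works); the 4-couple seating is invented for illustration.

```r
# variant of meet() that loops over all columns rather than a fixed 6 courses
meet <- function(I) {
  M <- matrix(0, nrow(I), nrow(I))
  for (i in seq_len(ncol(I))) M <- M + outer(I[, i], I[, i], "==")
  M
}
penalty <- function(M) sum(M == 0)

# toy case: 4 couples, 2 tables, 2 courses
I <- cbind(c(1, 1, 2, 2),     # course 1: couples 1,2 together, couples 3,4 together
           c(1, 2, 1, 2))     # course 2: couples 1,3 together, couples 2,4 together
meet(I)
penalty(meet(I))              # 4: pairs (1,4) and (2,3) never meet, each counted twice
```

Because the meeting matrix is symmetric, every missed pair contributes 2 to the penalty; this does not matter for the minimisation since the target is still 0.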

A family meeting (2)

Random switch of couples

- Pick two couples [among the 20 couples] at random with probabilities proportional to the number of other couples they have not seen

  prob=apply(meet(I)==0,1,sum)

- switch their respective positions during one of the 6 courses

- accept the switch with Metropolis–Hastings probability

  log(runif(1))<(penalty(old)-penalty(new))/gamma

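For instance, propose to replace

$$S = \begin{pmatrix} 1 & 1 & 3 & 2 & 4 & 2 \\ 2 & 1 & 3 & 3 & 2 & 4 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 3 & 2 & 2 & 2 & 4 & 4 \end{pmatrix} \quad\text{with}\quad S' = \begin{pmatrix} 1 & 1 & 3 & 2 & 2 & 2 \\ 2 & 1 & 3 & 3 & 4 & 4 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 3 & 2 & 2 & 2 & 4 & 4 \end{pmatrix}$$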

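# N (number of iterations) and gamma (temperature) must be set beforehand;
# neither value is given on the slide
for (t in 1:N){
  # pick two couples, favouring those with many unmet couples
  prop=sample(1:20,2,prob=apply(meet(I)==0,1,sum))
  cour=sample(1:6,1)   # course during which they swap tables
  Ip=I
  Ip[prop[1],cour]=I[prop[2],cour]
  Ip[prop[2],cour]=I[prop[1],cour]
  penatp=penalty(meet(Ip))
  # Metropolis acceptance at temperature gamma
  if (log(runif(1))<(penat-penatp)/gamma){
    I=Ip
    penat=penatp}
}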
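For a run that can be pasted as is, here is a self-contained sketch of the same scheme. It is not the code used for the lecture: the number of proposals, the logarithmic temperature schedule, the early stop at penalty 0 and the tracking of the best seating are all assumptions added here.

```r
set.seed(5)
n_couples <- 20; n_courses <- 6; n_tables <- 4

meet <- function(I) {
  M <- matrix(0, nrow(I), nrow(I))
  for (i in seq_len(ncol(I))) M <- M + outer(I[, i], I[, i], "==")
  M
}
penalty <- function(M) sum(M == 0)

# random start: each column is a random split of the couples over the tables
I <- sapply(seq_len(n_courses),
            function(k) sample(rep(seq_len(n_tables), n_couples / n_tables)))
penat <- penalty(meet(I))
start_penalty <- penat
best <- I; best_penalty <- penat

N <- 2000                                   # number of proposals (assumed)
for (t in seq_len(N)) {
  if (penat == 0) break                     # every couple has met every other one
  temp <- 1 / log(1 + t)                    # decreasing temperature (assumed schedule)
  unseen <- apply(meet(I) == 0, 1, sum)     # unmet couples, per couple
  prop <- sample(seq_len(n_couples), 2, prob = unseen)
  cour <- sample(seq_len(n_courses), 1)
  Ip <- I
  Ip[prop, cour] <- I[rev(prop), cour]      # swap the two couples' tables
  penatp <- penalty(meet(Ip))
  if (log(runif(1)) < (penat - penatp) / temp) {
    I <- Ip; penat <- penatp
    if (penat < best_penalty) { best <- I; best_penalty <- penat }
  }
}
c(start = start_penalty, best = best_penalty)
```

Stopping at penalty 0 also protects the call to `sample()`, which needs at least two couples with a positive weight; as long as one pair has not met, both of its members have one.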

A family meeting (3)

Solution

I

 [1,] 1 4 3 2 2 3
 [2,] 1 2 4 3 4 4
 [3,] 3 2 1 4 1 3
 [4,] 1 2 3 1 1 1
 [5,] 4 2 4 2 3 3
 [6,] 2 4 1 2 4 1
 [7,] 4 3 1 1 2 4
 [8,] 1 3 2 4 3 1
 [9,] 3 3 3 3 4 3
[10,] 4 4 2 3 1 1
[11,] 1 1 1 3 3 2
[12,] 3 4 4 1 3 2
[13,] 4 1 3 4 4 2
[14,] 2 4 3 4 3 4
[15,] 2 3 4 2 1 2
[16,] 2 2 2 3 2 2
[17,] 2 1 2 1 4 3
[18,] 4 3 1 1 2 4
[19,] 3 1 4 4 2 1
[20,] 3 1 2 2 1 4

[© http://www.metrolyrics.com]

Solving sudokus

Solving a Sudoku grid as a minimisation problem:

[© Dan Rice Sudoku blog]

Given a partly filled Sudoku grid (with a single solution),

- define a random Sudoku grid by filling the empty slots at random

  s=matrix(0,ncol=9,nrow=9)
  s[1,c(1,6,7)]=c(8,1,2)
  s[2,c(2:3)]=c(7,5)
  s[3,c(5,8,9)]=c(5,6,4)
  s[4,c(3,9)]=c(7,6)
  s[5,c(1,4)]=c(9,7)
  s[6,c(1,2,6,8,9)]=c(5,2,9,4,7)
  s[7,c(1:3)]=c(2,3,1)
  s[8,c(3,5,7,9)]=c(6,2,1,9)
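The slide only shows the clues being entered. One possible way to carry out the random filling, sketched under the assumption that every empty slot receives a digit drawn uniformly from 1 to 9 (the lecture's own code may restrict the candidates per slot):

```r
set.seed(6)
# clues as entered on the slide (0 marks an empty slot)
s <- matrix(0, ncol = 9, nrow = 9)
s[1, c(1, 6, 7)] <- c(8, 1, 2)
s[2, c(2:3)] <- c(7, 5)
s[3, c(5, 8, 9)] <- c(5, 6, 4)
s[4, c(3, 9)] <- c(7, 6)
s[5, c(1, 4)] <- c(9, 7)
s[6, c(1, 2, 6, 8, 9)] <- c(5, 2, 9, 4, 7)
s[7, c(1:3)] <- c(2, 3, 1)
s[8, c(3, 5, 7, 9)] <- c(6, 2, 1, 9)

empty <- which(s == 0)                       # positions that may be changed
cur <- s
cur[empty] <- sample(1:9, length(empty), replace = TRUE)   # uniform random fill
cur
```

Keeping the vector `empty` around is convenient later on, since the annealing moves should only ever touch those positions and leave the clues alone.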

- define a penalty function corresponding to the number of missed constraints

  #local score
  scor=function(i,s){
    a=((i-1)%%9)+1
    b=trunc((i-1)/9)
    boxa=3*trunc((a-1)/3)+1
    boxb=3*trunc(b/3)+1
    return(sum(s[i]==s[9*b+(1:9)])+
      sum(s[i]==s[a+9*(0:8)])+
      sum(s[i]==s[boxa:(boxa+2),boxb:(boxb+2)])-3)
  }
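The local score above is attached to one site. A global penalty for a completely filled grid can be sketched as the number of repeated digits over all rows, columns and 3x3 boxes. The function `grid_penalty` is a hypothetical stand-in written here, not the `target()` function used in the lecture, and the valid test grid comes from a standard shifting pattern.

```r
# number of repeated digits over all rows, columns and 3x3 boxes
# (hypothetical global penalty; the lecture's own target() is not shown)
grid_penalty <- function(s) {
  dup <- function(v) length(v) - length(unique(v))
  rows  <- sum(apply(s, 1, dup))
  cols  <- sum(apply(s, 2, dup))
  boxes <- 0
  for (a in c(1, 4, 7)) for (b in c(1, 4, 7))
    boxes <- boxes + dup(as.vector(s[a:(a + 2), b:(b + 2)]))
  rows + cols + boxes
}

# a valid grid built from a shifting pattern, for checking only
r  <- row(diag(9)) - 1
cc <- col(diag(9)) - 1
valid <- (3 * (r %% 3) + r %/% 3 + cc) %% 9 + 1
grid_penalty(valid)            # 0: no constraint is violated

broken <- valid
broken[1, 1] <- valid[1, 2]    # duplicate one digit in row 1, column 1 and box 1
grid_penalty(broken)           # 3
```

A grid is solved exactly when this penalty is 0, so it can play the role of the cost $C$ in the acceptance probability of the annealing algorithm.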

- fill “deterministic” slots

- make simulated annealing moves

  # random moves on the sites
  i=sample((1:81)[as.vector(s)==0],sample(1:sum(s==0),1,prob=1/(1:sum(s==0))))
  for (r in 1:length(i))
    prop[i[r]]=sample((1:9)[pool[i[r]+81*(0:8)]],1)
  if (log(runif(1))/lcur<tarcur-target(prop)){
    nchange=nchange+(tarcur>target(prop))
    cur=prop
    points(t,tarcur,col="forestgreen",cex=.3,pch=19)
    tarcur=target(cur)
  }

- many possible variants in the proposals

- rather slow and sometimes really slow

- may get stuck on a penalty of 2 (and never reach zero)

- does not compete at all with nonrandom solvers