TRANSCRIPT
David MacKay’s wooden blocks
John Skilling ([email protected]), MaxEnt2016, Ghent
This talk is dedicated to the memory of Professor Sir David MacKay FRS.
How many “crystal” arrangements are there?
David’s son wants to pack ½n² blocks of size 2×1 into a square n×n box.
Ω_2 = 2
Ω_4 = 36
Ω_12 = 53060477521960000
Ω_30 = 131841545472244027406496188757912375363891696443221279694626947912188459956437700105571773334900360294912000000 ≈ 10^110
Asymptotically, expect Ω to be multiplicative, Ω(A ∪ B) ∼ Ω(A) Ω(B), so log Ω will be additive: Ω ∼ exp(size).
Get Ω ≈ exp(0.5829 × ½n²).
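As a sanity check on the small counts above, a brute-force backtracking enumeration (a minimal Python sketch, feasible only for small n) reproduces Ω_2 = 2 and Ω_4 = 36:

```python
# Count 2x1 domino tilings of an n x n box by backtracking over cells
# in row-major order (only practical for small n).
def count_tilings(n):
    filled = [[False] * n for _ in range(n)]

    def fill(pos):
        if pos == n * n:
            return 1                      # every cell covered: one valid tiling
        r, c = divmod(pos, n)
        if filled[r][c]:
            return fill(pos + 1)          # already covered, move on
        total = 0
        if c + 1 < n and not filled[r][c + 1]:        # horizontal block
            filled[r][c] = filled[r][c + 1] = True
            total += fill(pos + 1)
            filled[r][c] = filled[r][c + 1] = False
        if r + 1 < n and not filled[r + 1][c]:        # vertical block
            filled[r][c] = filled[r + 1][c] = True
            total += fill(pos + 1)
            filled[r][c] = filled[r + 1][c] = False
        return total

    return fill(0)

print(count_tilings(2), count_tilings(4), count_tilings(6))   # 2 36 6728
```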
Start with states of the n×n model.
Each of N = 2n(n−1) bonds can be ON or OFF.
Define valency V = # ON bonds at node.
Define energy E = Σ_nodes |V − 1|.
Start with unrestricted bonds: 2^N possibilities.
. . . Compress . . .
End with E=0 crystals: Ω_n possibilities ≪ 2^N.
Get Ω by computing the proportion X = Ω_n / 2^N of crystals.
(X* = proportion of states with E ≤ E*: the computer does not “know” about size 2^N.)
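A minimal sketch of this representation and energy, assuming the N = 2n(n−1) bonds are stored as one array of horizontal bonds and one of vertical bonds (the helper names random_state and energy are illustrative, not the talk’s actual program):

```python
import numpy as np

def random_state(n, rng):
    """One of the 2^N unrestricted bond states of the n x n model."""
    h = rng.integers(0, 2, size=(n, n - 1))   # horizontal bonds, node (r,c)-(r,c+1)
    v = rng.integers(0, 2, size=(n - 1, n))   # vertical bonds,   node (r,c)-(r+1,c)
    return h, v

def energy(h, v):
    """E = sum over nodes of |V - 1|, with V = number of ON bonds at the node."""
    n = v.shape[1]
    V = np.zeros((n, n), dtype=int)
    V[:, :-1] += h; V[:, 1:] += h             # each ON horizontal bond feeds two nodes
    V[:-1, :] += v; V[1:, :] += v             # each ON vertical bond feeds two nodes
    return int(np.abs(V - 1).sum())

rng = np.random.default_rng(0)
print(energy(*random_state(4, rng)))          # E = 0 only for a perfect crystal
```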
[Figure: two 4×4 bond states with node valencies marked: a random model (one of the 16777216 models) and a crystal with every valency 1 (one of the 36 crystals).]
Strategy
Energy histogram for the 4×4 box: 16777216 models, 36 crystals.
Compression by ÷½ million for 4×4.
Compression needs guidance — control by energy.
Use random samples, limited by energy.
            # states            Entropy         Information
Models:     2^N = 16777216      S = log(2^N)    I = 0
Crystals:   Ω = 36              S = log Ω       I = log(2^N / Ω)
MCMC exploration
[Figure: a single MCMC move on a 4×4 state: randomly chosen bonds are inverted ON/OFF and the node valencies update.]
Best moves usually small and simple. Here, random bonds were inverted ON/OFF.
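A sketch of one such move and accept rule, reusing the hypothetical random_state/energy helpers above: invert a randomly chosen bond, and keep the move only if the sample stays within an energy ceiling E_max (random samples “limited by energy”). The real program would update E incrementally rather than recompute it each step:

```python
def explore(h, v, E_max, n_steps, rng):
    """MCMC exploration: random ON/OFF bond inversions, constrained to E <= E_max."""
    E = energy(h, v)
    for _ in range(n_steps):
        a = (h, v)[rng.integers(2)]                    # pick horizontal or vertical bonds
        idx = tuple(int(rng.integers(s)) for s in a.shape)
        a[idx] ^= 1                                    # invert the bond ON/OFF
        E_new = energy(h, v)
        if E_new <= E_max:
            E = E_new                                  # accept: still inside the ceiling
        else:
            a[idx] ^= 1                                # reject: restore the bond
    return h, v, E
```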
[Figure: snapshots from a random model to an ordered crystal, E = 148 → 48 → 2 → 0, with dislocations marked.]
Program tested up to 2000×2000 (2 million blocks and 2^(8 million) model states).
Program limited by annihilation of the last pair of dislocations.
For each crystal, the last pair could be in O(N²) places: the “flagpole in the Atlantic” problem — solutions wanted!
A compressive run
First (random) sample at E1 = 14. Proportion u1 ∼ Uniform(0, 1) lies inside: X1 = u1. Discard E > 14.
Second (E ≤ 14) sample at E2 = 10. Proportion u2 ∼ Uniform(0, 1) lies inside: X2 = X1 u2. Discard E > 10.
Third (E ≤ 10) sample at E3 = 8. Proportion u3 ∼ Uniform(0, 1) lies inside: X3 = X2 u3. Discard E > 8.
Next (E ≤ 8) sample also at E3 = 8. Another proportion u3′ ∼ Uniform(0, 1) lies inside: X3′ = X2 u3′. Compression reaches X̂3 = X2 min(u3, u3′). Discard E > 8.
Another (E ≤ 8) sample at E3 = 8. Yet another proportion u3″ ∼ Uniform(0, 1) lies inside: X3″ = X2 u3″. Compression reaches X̂3 = X2 min(u3, u3′, u3″). Discard E > 8.
Finally, a sample reaches lower, to E = 6. Proportion u4 ∼ Uniform(0, 1) lies inside: X4 = X̂3 u4. Discard E > 6.
More samples arrive at E = 6. Compression reaches X̂4 = X̂3 min(u4, u4′, …).
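Putting the pieces together, a loose end-to-end sketch of such a run, built on the hypothetical helpers above: draw successive samples under the current ceiling, count how many accumulate at each energy level before the ceiling can be lowered, and stop at E = 0 (in practice it can stall near E = 2, the dislocation problem noted earlier):

```python
import numpy as np

def compressive_run(n, steps_per_sample=10000, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    h, v = random_state(n, rng)                 # first (random) sample
    ceiling = energy(h, v)
    counts = {ceiling: 1}                       # samples accumulated at each level
    while ceiling > 0:
        # a fresh sample restricted to E <= ceiling ("discard E > ceiling")
        h, v, E = explore(h, v, ceiling, steps_per_sample, rng)
        counts[E] = counts.get(E, 0) + 1
        ceiling = min(ceiling, E)               # compress whenever a lower level is reached
    return counts                               # e.g. {14: 1, 10: 1, 8: 3, 6: 12, ...}
```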
Complete run:
1 sample at E=14:   X = u1
1 sample at E=10:   X = u1 u2
3 samples at E=8:   X = u1 u2 min(u3, u3′, u3″)
12 samples at E=6:  X = u1 u2 min(u3, u3′, u3″) min(u4, u4′, …)   [12 values u4, u4′, …]
21 samples at E=4:  X = u1 u2 min(u3, u3′, u3″) min(u4, u4′, …) min(u5, u5′, …)   [21 values u5, u5′, …]
71 samples at E=2:  X = u1 u2 min(u3, u3′, u3″) min(u4, u4′, …) min(u5, u5′, …) min(u6, u6′, …)   [71 values u6, u6′, …]
1 sample at E=0: end
Can infer X by simulation (every u ∼ Uniform(0, 1)), but prob(X) is badly skew (X ≪ 1, yet X cannot fall below 0).
log min(u, u′, …) over k values = −Σ_{i=1}^{k} 1/i ± (Σ_{i=1}^{k} 1/i²)^{1/2}
Use log X instead, which is additive so has meaningful moments.
Get log X = −15.43 ± 2.86.
Truth is X = 36/16777216 = exp(−13.05) = 2.15 × 10⁻⁶. ✓
Without logs, X = (3 ± 13) × 10⁻⁶. ✗
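As a check, the quoted estimate follows directly from that formula and the sample counts listed above (a short Python sketch):

```python
import math

counts = [1, 1, 3, 12, 21, 71]        # samples at E = 14, 10, 8, 6, 4, 2

# E[log min of k uniforms] = -sum_{i<=k} 1/i,  Var = sum_{i<=k} 1/i^2
mean = -sum(sum(1.0 / i for i in range(1, k + 1)) for k in counts)
var  =  sum(sum(1.0 / i ** 2 for i in range(1, k + 1)) for k in counts)

print(f"log X = {mean:.2f} +- {math.sqrt(var):.2f}")   # log X = -15.43 +- 2.86
print(f"truth   {math.log(36 / 16777216):.2f}")        # -13.05
```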
Density of states

1 sample at E=14:   log X = −1
1 sample at E=10:   log X = −2
3 samples at E=8:   log X = −3.83
12 samples at E=6:  log X = −6.94
21 samples at E=4:  log X = −10.58
71 samples at E=2:  log X = −15.43
1 sample at E=0: end

[Figure: staircase plot of log X (0 down to −15) against E (0 to 16), with each level Ei carrying its compressive range ΔXi.]

Associate each E with its compressive range ΔX.
g(E) = proportion of models per unit energy = Σ_i δ(E − Ei) ΔXi = “density of states”.
We get the full relationship between E and X, not just the final compression Xend.
This compressive algorithm is called nested sampling.
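A small sketch of this read-out, turning the log X sequence of the run above into the compressive ranges ΔXi that weight each level; here ΔXi is taken as X_{i−1} − X_i with X_0 = 1, one natural reading of “compressive range”:

```python
import math

E    = [14, 10, 8, 6, 4, 2]
logX = [-1, -2, -3.83, -6.94, -10.58, -15.43]

X  = [math.exp(l) for l in logX]
dX = [prev - cur for prev, cur in zip([1.0] + X[:-1], X)]

# g(E) = sum_i delta(E - E_i) dX_i : each level E_i carries weight dX_i
for Ei, dXi in zip(E, dX):
    print(f"E = {Ei:2d}   dX = {dXi:.3e}")
```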
For “model” and “crystal”, read “prior” and “posterior”.
For “energy βE”, read “−log L” where L is likelihood.
For “partition function Z”, read “evidence Z”.
For “X”, read “enclosed proportion of prior”.
Bayesian inference is the elementary special case of unit temperature.
Nested sampling gets sequence (X1, L1), (X2, L2), . . .
Bayes: Prior × Likelihood = Evidence × Posterior
Evidence Z = Σ_i Li ΔXi.   Posterior Pr(i) = Li ΔXi / Z.
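A minimal sketch of that Bayesian read-out from the sequence (X1, L1), (X2, L2), …, using the same ΔXi = X_{i−1} − X_i convention as above (an illustrative helper, not a library routine):

```python
def evidence_and_posterior(X, L):
    """Evidence Z = sum_i L_i dX_i and posterior weights Pr(i) = L_i dX_i / Z."""
    Z, w = 0.0, []
    X_prev = 1.0                        # the whole prior encloses proportion 1
    for Xi, Li in zip(X, L):
        dXi = X_prev - Xi               # compressive range associated with L_i
        w.append(Li * dXi)
        Z += Li * dXi
        X_prev = Xi
    return Z, [wi / Z for wi in w]
```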
Physics
Physical applications have meaningful energy and often involve temperature T = 1/β.
Occupancies are modulated proportionally to e^(−βE), and physical properties follow.
Partition function:  Z = ∫ e^(−βE) g(E) dE,   evaluated as   Z = Σ_i e^(−βEi) ΔXi
Internal energy:  U = ∫ E e^(−βE) g(E) dE / ∫ e^(−βE) g(E) dE ≡ ⟨E⟩_β,   evaluated as   U = Σ_i Ei e^(−βEi) ΔXi / Σ_i e^(−βEi) ΔXi
Specific heat:  C = dU/dT, etc.
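A minimal sketch of these thermal sums from nested-sampling output (Ei, ΔXi). The specific heat is evaluated with the standard fluctuation identity C = β² Var(E) (taking k_B = 1), which is equivalent to dU/dT but not spelled out on the slide:

```python
import numpy as np

def thermal_properties(E, dX, beta):
    """Partition function, internal energy and specific heat at inverse temperature beta."""
    E, dX = np.asarray(E, float), np.asarray(dX, float)
    w = np.exp(-beta * E) * dX
    Z = w.sum()                         # Z = sum_i exp(-beta E_i) dX_i
    U = (E * w).sum() / Z               # U = <E>_beta
    C = beta ** 2 * ((E ** 2 * w).sum() / Z - U ** 2)   # C = dU/dT = beta^2 Var(E)
    return Z, U, C
```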
Note that ∫ … e^(−βE) dX is a Laplace transform, which smooths the operand.
micro-property(E)  --[Laplace transform]-->  Macro-property(T)
signal + noise (over energy)  --[Laplace transform]-->  smooth property (over temperature)
--[Thermal algorithms]-->  signal + noise (over temperature)  =  noisy property (over temperature)
Contrast thermal algorithms (annealing) which control on temperature.
Nested sampling controls on energy, which is the right way round.
Finale
Wooden blocks would make a great student assignment.
The problem is simply understood and visual, but captures the essence of a real problem.
It’s concerned merely with counting, with no confusion about meaning or philosophy.
Exact results are available for small sizes.
The exercise is reasonably challenging.
It’s open-ended (different exploration engines, algorithm efficiency, 3 dimensions, . . . )
The student who can do this will be well placed to program Bayesian inference professionally.
And David would surely have approved with enthusiasm.